Releases: apache/skywalking
v10.1.0
Download
https://skywalking.apache.org/downloads/
Notice
Don't download source codes from this page.
Please follow build document, if you want to build source codes by yourself.
A Version of PERFORMANCE
- Huge UI Performance Improvement. Metrics widgets queries are bundled by leveraging the GraphQL capabilities.
- Parallel Queries Support in GraphQL engine. Improve query performance.
- Significantly improve the performance of OTEL metrics handler. Reduce CPU and GC costs in OTEL metrics processes.
- By adopting BanyanDB 0.7, native database performance and stability are improved.
Project
- E2E: bump up the version of the opentelemetry-collector to 0.102.1.
- Push snapshot data-generator docker image to ghcr.io.
- Bump up skywalking-infra-e2e to work around GHA removing `docker-compose` v1.
- Bump up CodeQL GitHub Actions.
- Fix wrong phase of delombok plugin to reduce build warnings.
- Use ci-friendly revision to set the project version.
OAP Server
- Fix wrong indices in the eBPF Profiling related models.
- Support exclude the specific namespaces traffic in the eBPF Access Log receiver.
- Add Golang as a supported language for Elasticsearch.
- Remove unnecessary BanyanDB flushing logs(info).
- Increase `SW_CORE_GRPC_MAX_MESSAGE_SIZE` to 50MB.
- Support to query relation metrics through PromQL.
- Support trace MQE query for debugging.
- Add Component ID(158) for the Solon framework.
- Fix metrics tag in HTTP handler of browser receiver plugin.
- Increase `alarm_record#message` column length to 2000 from 200.
- Remove `alarm_record#message` column indexing.
- Add Python as a supported language for Pulsar.
- Make more proper histogram buckets for the `persistence_timer_bulk_prepare_latency`, `persistence_timer_bulk_execute_latency` and `persistence_timer_bulk_all_latency` metrics in PersistenceTimer.
- [Break Change] Update Nacos version to 2.3.2. Nacos 1.x server can't serve as cluster coordinator and configuration server.
- Support tracing trace query(SkyWalking and Zipkin) for debugging.
- Fix BanyanDB metrics query: used the wrong `Downsampling` type to find the schema.
- Support fetch cilium flow to monitoring network traffic between cilium services.
- Support `labelCount` function in the OAL engine.
- Support BanyanDB internal measure query execution tracing.
- BanyanDB client config: raise the default `maxBulkSize` to 10000, add `flushTimeout` and set default to 10s.
- Polish BanyanDB group and schema creation logic to fix the schema creation failure issue in distributed race conditions.
- Support tracing topology query for debugging.
- Fix expression of graph `Current QPS` in MySQL dashboard.
- Support tracing logs query for debugging.
- BanyanDB: fix Tag autocomplete data storage and query.
- Support aggregation operators in PromQL query.
- Update the kubernetes HTTP latency related metrics source unit from `ns` to `ms`.
- Support BanyanDB internal stream query execution tracing.
- Fix Elasticsearch, MySQL, RabbitMQ dashboards typos and missing expressions.
- BanyanDB: Zipkin Module set service as Entity for improving the query performance.
- MQE: check the metrics value before do binary operation to improve robustness.
- Replace workaround with Armeria native supported context path.
- Add an http endpoint wrapper for health check.
- Bump up Armeria and transitive dependencies.
- BanyanDB: if the model column is already a `@BanyanDB.TimestampColumn`, set `@BanyanDB.NoIndexing` on it to reduce indexes.
- BanyanDB: stream sort-by `time` query, use internal time-series rather than `index` to improve the query performance.
- Bump up graphql-java to 21.5.
- Add Unknown Node when receive Kubernetes peer address is not aware in current cluster.
- Fix CounterWindow concurrent increase cause NPE by PriorityQueue
- Fix format the endpoint name with empty string.
- Support async query for the composite GraphQL query.
- Get endpoint list order by timestamp desc.
- Support sort queries on metrics generated by eBPF receiver.
- Fix the compatibility with Grafana 11 when using label_values query variables.
- Nacos as config server and cluster coordinator supports configuration contextPath.
- Update the endpoint name format to `<Method>:<Path>` in eBPF Access Log Receiver.
- Add self-observability metrics for OpenTelemetry receiver.
- Support service level metrics aggregate when missing pod context in eBPF Access Log Receiver.
- Fix query `getGlobalTopology` throw exception when didn't find any services by the given Layer.
- Fix the previous analysis result missing in the ALS `k8s-mesh` analyzer.
- Fix `findEndpoint` query requires `keyword` when using BanyanDB.
- Support to analysis the ztunnel mapped IP address and mTLS mode in eBPF Access Log Receiver.
- Adapt BanyanDB Java Client 0.7.0.
- Add SkyWalking Java Agent self observability dashboard.
- Add Component ID(5022) for the GoFrame framework.
- Bump up protobuf java dependencies to 3.25.5.
- BanyanDB: support using native term searching for `keyword` in query `findEndpoint` and `getAlarm`.
- BanyanDB: support TLS connection and configuration.
- PromQL service: query API support RFC3339 time format.
- Improve the performance of OTEL metrics handler.
- PromQL service: fix operators result missing `rangeExpression` flag.
- BanyanDB: use `TimestampRange` to improve "events" query for BanyanDB.
- Optimize `network_address_alias` table to reduce the number of the index.
- PromQL service: support round brackets operator.
- Support query Alarm message Tag for auto-complete.
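For the PromQL time format item above, the following is a minimal client-side sketch of normalizing a query time before sending it. It assumes the PromQL service follows the Prometheus HTTP API convention of accepting either Unix seconds or an RFC 3339 timestamp; the helper name `to_epoch_millis` is hypothetical and not part of SkyWalking.

```python
from datetime import datetime


def to_epoch_millis(value: str) -> int:
    """Convert Unix seconds or an RFC 3339 string to epoch milliseconds.

    Hypothetical helper for illustration only; not a SkyWalking API.
    """
    text = value.strip()
    try:
        # Plain Unix seconds, optionally fractional, e.g. "1.5".
        return round(float(text) * 1000)
    except ValueError:
        pass
    # datetime.fromisoformat() only accepts a trailing "Z" on Python 3.11+,
    # so rewrite it as an explicit UTC offset first.
    if text.endswith(("Z", "z")):
        text = text[:-1] + "+00:00"
    parsed = datetime.fromisoformat(text)
    if parsed.tzinfo is None:
        # RFC 3339 requires an offset; refuse to guess a time zone.
        raise ValueError(f"timestamp has no UTC offset: {value!r}")
    return round(parsed.timestamp() * 1000)
```

Rejecting timestamps without an offset keeps the conversion independent of the local time zone of the machine running the script.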
UI
- Highlight search log keywords.
- Add Error URL in the browser log.
- Add a SolonMVC icon.
- Adding cilium icon and i18n for menu.
- Fix the mismatch between the unit and calculation of the "Network Bandwidth Usage" widget in Windows-Service Dashboard.
- Make a maximum 20 entities per query in service/instance/endpoint list widgets.
- Polish error nodes in trace widget.
- Introduce flame graph to the trace profiling.
- Correct services and instances when changing page numbers.
- Improve metric queries to make page opening brisker.
- Bump up dependencies to fix CVEs.
- Add a loading view for initialization page.
- Fix a bug for selectors when clicking the refresh icon.
- Fix health check to OAP backend.
- Add `Service`, `ServiceInstance`, `Endpoint` dashboard forwarder to Kubernetes Topologies.
- Fix pagination for service/instance list widgets.
- Add queries for alarm tags.
- Add skywalking java agent self observability menu.
Documentation
- Update the version description supported by zabbix receiver.
- Move the Official Dashboard docs to marketplace docs.
- Add marketplace introduction docs under `quick start` menu to reduce the confusion of finding feature docs.
- Update Windows Metrics(Swap -> Virtual Memory)
New Contributors
- @xsShuang made their first contribution in #12347
- @friendlytkyj made their first contribution in #12363
- @shalk made their first contribution in #12362
- @kael-aiur made their first contribution in #12505
- @wetool19 made their first contribution in #12558
- @huicunjun made their first contribution in #12624
All issues and pull requests are here
10.0.1
Download
https://skywalking.apache.org/downloads/
Notice
Don't download source codes from this page.
Please follow build document, if you want to build source codes by yourself.
Project
- Add SBOM (Software Bill of Materials) to the project.
OAP Server
- Fix LAL test query api.
- Add component libraries of `Derby`/`Sybase`/`SQLite`/`DB2`/`OceanBase` jdbc driver.
- Fix setting the wrong interval to day level measure schema in BanyanDB installation process.
UI
- Fix widget title and tips.
- Fix statistics span data.
- Fix browser log display.
- Fix the topology layout when there are multiple independent network relationships.
All issues and pull requests are here
10.0.0
Download
https://skywalking.apache.org/downloads/
Notice
Don't download source codes from this page.
Please follow build document, if you want to build source codes by yourself.
Service Hierarchy
Service Hierarchy | Hierarchy Graph |
---|---|
Run with BanyanDB 0.6 in the Cluster Mode
Project
- Support Java 21 runtime.
- Support oap-java21 image for Java 21 runtime.
- Upgrade `OTEL collector` version to `0.92.0` in all e2e tests.
- Switch CI macOS runner to m1.
- Upgrade PostgreSQL driver to `42.4.4` to fix CVE-2024-1597.
- Remove CLI(`swctl`) from the image.
- Remove CLI_VERSION variable from Makefile build.
- Add BanyanDB to docker-compose quickstart.
- Bump up Armeria, jackson, netty, jetcd and grpc to fix CVEs.
- Bump up BanyanDB Java Client to 0.6.0.
OAP Server
- Add `layer` parameter to the global topology graphQL query.
- Add `is_present` function in MQE for check if the list metrics has a value or not.
- Remove unreasonable default configurations for gRPC thread executor.
- Remove `gRPCThreadPoolQueueSize (SW_RECEIVER_GRPC_POOL_QUEUE_SIZE)` configuration.
- Allow excluding ServiceEntries in some namespaces when looking up ServiceEntries as a final resolution method of service metadata.
- Set up the length of source and dest IDs in relation entities of service, instance, endpoint, and process to 250(was 200).
- Support build Service/Instance Hierarchy and query.
- Change the string field in Elasticsearch storage from keyword type to text type if it set more than `32766` length.
- [Break Change] Change the configuration field of `ui_template` and `ui_menu` in Elasticsearch storage from keyword type to text.
- Support Service Hierarchy auto matching, add auto matching layer relationships (upper -> lower) as following:
  - MESH -> MESH_DP
  - MESH -> K8S_SERVICE
  - MESH_DP -> K8S_SERVICE
  - GENERAL -> K8S_SERVICE
- Add `namespace` suffix for `K8S_SERVICE_NAME_RULE/ISTIO_SERVICE_NAME_RULE` and `metadata-service-mapping.yaml` as default.
- Allow using a dedicated port for ALS receiver.
- Fix log query by traceId in `JDBCLogQueryDAO`.
- Support handler eBPF access log protocol.
- Fix SumPerMinFunctionTest error function.
- Remove unnecessary annotations and functions from Meter Functions.
- Add `max` and `min` functions for MAL down sampling.
- Fix critical bug of uncontrolled memory cost of TopN statistics. Change topN group key from `StorageId` to `entityId + timeBucket`.
- Add Service Hierarchy auto matching layer relationships (upper -> lower) as following:
  - MYSQL -> K8S_SERVICE
  - POSTGRESQL -> K8S_SERVICE
  - SO11Y_OAP -> K8S_SERVICE
  - VIRTUAL_DATABASE -> MYSQL
  - VIRTUAL_DATABASE -> POSTGRESQL
- Add Golang as a supported language for AMQP.
- Support available layers of service in the topology.
- Add `count` aggregation function for MAL
- Add Service Hierarchy auto matching layer relationships (upper -> lower) as following:
  - NGINX -> K8S_SERVICE
  - APISIX -> K8S_SERVICE
  - GENERAL -> APISIX
- Add Golang as a supported language for RocketMQ.
- Support Apache RocketMQ server monitoring.
- Add Service Hierarchy auto matching layer relationships (upper -> lower) as following:
  - ROCKETMQ -> K8S_SERVICE
  - VIRTUAL_MQ -> ROCKETMQ
- Fix ServiceInstance `in` query.
- Mock `/api/v1/status/buildinfo` for PromQL API.
- Fix table exists check in the JDBC Storage Plugin.
- Fix day-based table rolling time range strategy in JDBC storage.
- Add `maxInboundMessageSize (SW_DCS_MAX_INBOUND_MESSAGE_SIZE)` configuration to change the max inbound message size of DCS.
- Fix Service Layer when building Events in the EventHookCallback.
- Add Golang as a supported language for Pulsar.
- Add Service Hierarchy auto matching layer relationships (upper -> lower) as following:
  - RABBITMQ -> K8S_SERVICE
  - VIRTUAL_MQ -> RABBITMQ
- Remove Column#function mechanism in the kernel.
- Make query `readMetricValue` always return the average value of the duration.
- Add Service Hierarchy auto matching layer relationships (upper -> lower) as following:
  - KAFKA -> K8S_SERVICE
  - VIRTUAL_MQ -> KAFKA
- Support ClickHouse server monitoring.
- Add Service Hierarchy auto matching layer relationships (upper -> lower) as following:
  - CLICKHOUSE -> K8S_SERVICE
  - VIRTUAL_DATABASE -> CLICKHOUSE
- Add Service Hierarchy auto matching layer relationships (upper -> lower) as following:
  - PULSAR -> K8S_SERVICE
  - VIRTUAL_MQ -> PULSAR
- Add Golang as a supported language for Kafka.
- Support displaying the port services listen to from OAP and UI during server start.
- Refactor data-generator to support generating metrics.
- Fix `AvgHistogramPercentileFunction` legacy name.
- [Break Change] Labeled Metrics support multiple labels.
  - Storage: store all label names and values instead of only the values.
  - MQE:
    - Support querying by multiple labels(name and value) instead using `_` as the anonymous label name.
    - `aggregate_labels` function support aggregate by specific labels.
    - `relabels` function require target label and rename label name and value.
  - PromQL:
    - Support querying by multiple labels(name and value) instead using `labels` as the anonymous label name.
    - Remove general labels `labels/relabels/label` function.
    - API `/api/v1/labels` and `/api/v1/label/<label_name>/values` support return matched metrics labels.
  - OAL:
    - Deprecate `percentile` function and introduce `percentile2` function instead.
- Bump up Kafka to fix CVE.
- Fix `NullPointerException` in Istio ServiceEntry registry.
- Remove unnecessary `componentIds` as series ID in the `ServiceRelationClientSideMetrics` and `ServiceRelationServerSideMetrics` entities.
- Fix not throw error when part of expression not matched any expression node in the `MQE` and `PromQL`.
- Remove `kafka-fetcher/default/createTopicIfNotExist` as the creation is automatically since #7326 (v8.7.0).
- Fix inaccuracy nginx service metrics.
- Fix/Change Windows metrics name(Swap -> Virtual Memory)
  - `memory_swap_free` -> `memory_virtual_memory_free`
  - `memory_swap_total` -> `memory_virtual_memory_total`
  - `memory_swap_percentage` -> `memory_virtual_memory_percentage`
- Fix/Change UI init setting for Windows Swap -> Virtual Memory
- Fix `Memory Swap Usage`/`Virtual Memory Usage` display with UI init.(Linux/Windows)
- Fix inaccurate APISIX metrics.
- Fix inaccurate MongoDB Metrics.
- Support Apache ActiveMQ server monitoring.
- Add Service Hierarchy auto matching layer relationships (upper -> lower) as following:
  - ACTIVEMQ -> K8S_SERVICE
- Calculate Nginx service HTTP Latency by MQE.
- MQE query: make metadata not return `null`.
- MQE labeled metrics Binary Operation: return empty value if the labels not match rather than report error.
- Fix inaccurate Hierarchy of RabbitMQ Server monitoring metrics.
- Fix inaccurate MySQL/MariaDB, Redis, PostgreSQL metrics.
- Support DoubleValue,IntValue,BoolValue in OTEL metrics attributes.
- [Break Change] gRPC metrics exporter unified the metric value type and support labeled metrics.
- Add component definition(ID=152) for `c3p0`(JDBC3 Connection and Statement Pooling).
- Fix MQE `top_n` global query.
- Fix inaccurate Pulsar and Bookkeeper metrics.
- MQE support `sort_values` and `sort_label_values` functions.
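The Service Hierarchy auto matching relationships listed across the items above can be collected into one lookup table. The sketch below is only an illustration: the pairs are copied from this release note, while the helper names and the transitive walk are assumptions, not a description of how the OAP server resolves hierarchies internally.

```python
from collections import defaultdict
from typing import Dict, Iterable, Set, Tuple

# (upper, lower) layer pairs as listed in the 10.0.0 release note.
RELATIONSHIPS = [
    ("MESH", "MESH_DP"), ("MESH", "K8S_SERVICE"), ("MESH_DP", "K8S_SERVICE"),
    ("GENERAL", "K8S_SERVICE"), ("MYSQL", "K8S_SERVICE"),
    ("POSTGRESQL", "K8S_SERVICE"), ("SO11Y_OAP", "K8S_SERVICE"),
    ("VIRTUAL_DATABASE", "MYSQL"), ("VIRTUAL_DATABASE", "POSTGRESQL"),
    ("NGINX", "K8S_SERVICE"), ("APISIX", "K8S_SERVICE"), ("GENERAL", "APISIX"),
    ("ROCKETMQ", "K8S_SERVICE"), ("VIRTUAL_MQ", "ROCKETMQ"),
    ("RABBITMQ", "K8S_SERVICE"), ("VIRTUAL_MQ", "RABBITMQ"),
    ("KAFKA", "K8S_SERVICE"), ("VIRTUAL_MQ", "KAFKA"),
    ("CLICKHOUSE", "K8S_SERVICE"), ("VIRTUAL_DATABASE", "CLICKHOUSE"),
    ("PULSAR", "K8S_SERVICE"), ("VIRTUAL_MQ", "PULSAR"),
    ("ACTIVEMQ", "K8S_SERVICE"),
]


def build_index(pairs: Iterable[Tuple[str, str]]) -> Dict[str, Set[str]]:
    """Map each upper layer to the set of its directly listed lower layers."""
    index: Dict[str, Set[str]] = defaultdict(set)
    for upper, lower in pairs:
        index[upper].add(lower)
    return dict(index)


def reachable_lower(index: Dict[str, Set[str]], layer: str) -> Set[str]:
    """Collect every layer reachable by following upper -> lower edges."""
    seen: Set[str] = set()
    stack = [layer]
    while stack:
        for lower in index.get(stack.pop(), ()):
            if lower not in seen:
                seen.add(lower)
                stack.append(lower)
    return seen


INDEX = build_index(RELATIONSHIPS)
```

For example, `INDEX["VIRTUAL_MQ"]` holds the four message queue layers, and `reachable_lower(INDEX, "VIRTUAL_DATABASE")` additionally reaches `K8S_SERVICE` through the database layers.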
UI
- Fix the mismatch between the unit and calculation of the "Network Bandwidth Usage" widget in Linux-Service Dashboard.
- Add theme change animation.
- Implement the Service and Instance hierarchy topology.
- Support Tabs in the widget visible when MQE expressions.
- Support search on Marketplace.
- Fix default route.
- Fix layout on the Log widget.
- Fix Trace associates with Log widget.
- Add isDefault to the dashboard configuration.
- Add expressions to dashboard configurations on the dashboard list page.
- Update Kubernetes related UI templates for adapt data from eBPF access log.
- Fix dashboard `K8S-Service-Root` metrics expression.
- Add dashboards for Service/Instance Hierarchy.
- Fix MQE in dashboards when using `Card widget`.
- Optimize tooltips style.
- Fix resizing window causes the trace graph to display incorrectly.
- Add the not found page(404).
- Enhance VNode logic and support multiple Trace IDs in span's ref.
- Add the layers field and associate layers dashboards for the service topology nodes.
- Fix `Nginx-Instance` metrics to instance level.
- Update tabs of the Kubernetes service page.
- Add Airflow menu i18n.
- Add Support for dragging in the trace panel.
- Add workflow icon.
- Metrics support multiple labels.
- Support the `SINGLE_VALUE` for table widgets.
- Remove the General metric mode and related logical code.
- Remove metrics for unreal nodes in the topology.
- Enhance the Trace widget for batch consuming spans.
- Clean the unused elements in the UI-templates.
Documentation
- Update the release doc to remove the announcement as the tests are through e2e rather than manually.
- Update the release notification mail a little.
- Polish docs structure. Move customization docs separately from the introduction docs.
- Add webhook/gRPC hooks settings example for `backend-alarm.md`.
- Begin the process of `SWIP - SkyWalking Improvement Proposal`.
- Add `SWIP-1 Create and det...
9.7.0
Download
https://skywalking.apache.org/downloads/
Notice
Don't download source codes from this page.
Please follow build document, if you want to build source codes by yourself.
Dark Mode
The default style mode is changed to the dark mode, and light mode is still available.
New Design Log View
A new design for the log view is currently available. Easier to locate the logs, and more space for the raw text.
Project
- Bump Java agent to 9.1-dev in the e2e tests.
- Bump up netty to 4.1.100.
- Update Groovy 3 to 4.0.15.
- Support packaging the project in JDK21. Compiler source and target remain in JDK11.
OAP Server
- ElasticSearchClient: Add `deleteById` API.
- Fix Custom alarm rules are overwritten by 'resource/alarm-settings.yml'
- Support Kafka Monitoring.
- Support Pulsar server and BookKeeper server Monitoring.
- [Breaking Change] Elasticsearch storage merge all management data indices into one index `management`, including `ui_template`, `ui_menu`, `continuous_profiling_policy`.
- Add a release mechanism for alarm windows when it is expired in case of OOM.
- Fix Zipkin trace receiver response: make the HTTP status code from `200` to `202`.
- Update BanyanDB Java Client to 0.5.0.
- Fix getInstances query in the BanyanDB Metadata DAO.
- BanyanDBStorageClient: Add `keepAliveProperty` API.
- Fix table exists check in the JDBC Storage Plugin.
- Enhance extensibility of HTTP Server library.
- Adjust `AlarmRecord` alarmMessage column length to 512.
- Fix `EventHookCallback` build event: build the layer from `Service's Layer`.
- Fix `AlarmCore` doAlarm: catch exception for each callback to avoid interruption.
- Optimize queryBasicTraces in TraceQueryEsDAO.
- Fix `WebhookCallback` send incorrect messages, add catch exception for each callback HTTP Post.
- Fix AlarmRule expression validation: add labeled metrics mock data for check.
- Support collect ZGC memory pool metrics.
- Add a component ID for Netty-http (ID=151).
- Add a component ID for Fiber (ID=5021).
- BanyanDBStorageClient: Add `define(Property property, PropertyStore.Strategy strategy)` API.
- Correct the file format and fix typos in the filenames for monitoring Kafka's e2e tests.
- Support extract timestamp from patterned datetime string in LAL.
- Support output key parameters in the booting logs.
- Fix cannot query zipkin traces with `annotationQuery` parameter in the JDBC related storage.
- Fix `limit` doesn't work for `findEndpoint` API in ES storage.
- Isolate MAL CounterWindow cache by metric name.
- Fix JDBC Log query order.
- Change the DataCarrier IF_POSSIBLE strategy to use ArrayBlockingQueue implementation.
- Change the policy of the queue(DataCarrier) in the L1 metric aggregate worker to IF_POSSIBLE mode.
- Add self-observability metric `metrics_aggregator_abandon` to count the number of abandon metrics.
- Support Nginx monitoring.
- Fix `BanyanDB Metadata Query`: make query single instance/process return full tags to avoid NPE.
- Replace go2sky E2E to GO agent.
- Replace Metrics v2 protocol with MQE in UI templates and E2E Test.
- Fix incorrect apisix metrics otel rules.
- Support `Scratch The OAP Config Dump`.
- Support `increase/rate` function in the `MQE` query language.
- Group service endpoints into `_abandoned` when endpoints have high cardinality.
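The `_abandoned` grouping in the last item can be pictured with a small cardinality guard. This is a minimal sketch under assumptions: the release note does not state the actual limit or the rule the OAP server applies, so the `max_endpoints` threshold and the `EndpointGrouper` class are hypothetical.

```python
from typing import Set

ABANDONED = "_abandoned"  # group name taken from the release note


class EndpointGrouper:
    """Keep the first N distinct endpoint names and fold the rest together.

    Hypothetical illustration; not the OAP server implementation.
    """

    def __init__(self, max_endpoints: int) -> None:
        self.max_endpoints = max_endpoints  # assumed, illustrative limit
        self.known: Set[str] = set()

    def group(self, endpoint: str) -> str:
        if endpoint in self.known:
            return endpoint
        if len(self.known) < self.max_endpoints:
            self.known.add(endpoint)
            return endpoint
        # Too many distinct names already: report under the shared bucket.
        return ABANDONED
```

Capping the number of distinct names bounds the number of metric series a single service can create, at the cost of losing detail for late-arriving endpoints.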
UI
- Add new menu for kafka monitoring.
- Fix independent widget duration.
- Fix the display height of the link tree structure.
- Replace the name by shortName on service widget.
- Refactor: update pagination style. No visualization style change.
- Apply MQE on K8s layer UI-templates.
- Fix icons display in trace tree diagram.
- Fix: update tooltip style to support multiple metrics scrolling view in a metrics graph.
- Add a new widget to show jvm memory pool detail.
- Fix: avoid querying data with empty parameters.
- Add a title and a description for trace segments.
- Add Netty icon for Netty HTTP plugin.
- Add Pulsar menu i18n files.
- Refactor Logs view.
- Implement the Dark Theme.
- Change UI templates for Text widgets.
- Add Nginx menu i18n.
- Fix the height for trace widget.
- Polish list style.
- Fix Log associate with Trace.
- Enhance layout for broken Topology widget.
- Fix calls metric with call type for Topology widget.
- Fix changing metrics config for Topology widget.
- Fix routes for Tab widget.
- Remove OpenFunction(FAAS layer) relative UI templates and menu item.
- Fix: change colors to match dark theme for Network Profiling.
- Remove the description of OpenFunction in the UI i18n.
- Reduce component chunks to improve page loading resource time.
Documentation
- Separate storage docs to different files, and add an estimated timeline for BanyanDB(end of 2023).
- Add topology configuration in UI-Grafana doc.
- Add missing metrics to the `OpenTelemetry Metrics` doc.
- Polish docs of `Concepts and Designs`.
- Fix incorrect notes of slowCacheReadThreshold.
- Update OAP setup and cluster coordinator docs to explain new booting parameters table in the logs, and how to setup cluster mode.
All issues and pull requests are here
9.6.0
Download
https://skywalking.apache.org/downloads/
Notice
Don't download source codes from this page.
Please follow build document, if you want to build source codes by yourself.
New Alerting Kernel
Support Loki LogQL
- Newly added support for Loki LogQL and Grafana Loki Dashboard for SkyWalking collected logs
WARNING
- ElasticSearch 6 storage related tests are removed. It worked, but is no longer promised since it has officially reached end of life.
Project
- Bump up Guava to 32.0.1 to avoid the lib listed as vulnerable due to CVE-2020-8908. This API is never used.
- Maven artifact `skywalking-log-recevier-plugin` is renamed to `skywalking-log-receiver-plugin`.
- Bump up cli version 0.11 to 0.12.
- Bump up the version of ASF parent pom to v30.
- Make builds reproducible for automatic releases CI.
OAP Server
- Add Neo4j component ID(112) language: Python.
- Add Istio ServiceEntry registry to resolve unknown IPs in ALS.
- Wrap `deleteProperty` API to the BanyanDBStorageClient.
- [Breaking change] Remove `matchedCounter` from `HttpUriRecognitionService#feedRawData`.
- Remove patterns from `HttpUriRecognitionService#feedRawData` and add max 10 candidates of raw URIs for each pattern.
- Add component ID for WebSphere.
- Fix AI Pipeline uri caching NullPointer and IllegalArgument Exceptions.
- Fix `NPE` in metrics query when the metric is not exist.
- Remove E2E tests for Istio < 1.15, ElasticSearch < 7.16.3, they might still work but are not supported as planned.
- Scroll all results in ElasticSearch storage and refactor scrolling logics, including Service, Instance, Endpoint, Process, etc.
- Improve Kubernetes coordinator to remove `Terminating` OAP Pods in cluster.
- Support `SW_CORE_SYNC_PERIOD_HTTP_URI_RECOGNITION_PATTERN` and `SW_CORE_TRAINING_PERIOD_HTTP_URI_RECOGNITION_PATTERN` to control the period of training and sync HTTP URI recognition patterns. And shorten the default period to 10s for sync and 60s for training.
- Fix ElasticSearch scroller bug.
- Add component ID for Aerospike(ID=149).
- Packages with name `recevier` are renamed to `receiver`.
- `BanyanDBMetricsDAO` handles `storeIDTag` in `multiGet` for `BanyanDBModelExtension`.
- Fix endpoint grouping-related logic and enhance the performance of PatternTree retrieval.
- Fix metric session cache saving after batch insert when using `mysql-connector-java`.
- Support dynamic UI menu query.
- Add comment for `docker/.env` to explain the usage.
- Fix wrong environment variable name `SW_OTEL_RECEIVER_ENABLED_OTEL_RULES` to right `SW_OTEL_RECEIVER_ENABLED_OTEL_METRICS_RULES`.
- Fix instance query in JDBC implementation.
- Set the `SW_QUERY_MAX_QUERY_COMPLEXITY` default value to 3000(was 1000).
- Accept `length=4000` parameter value of the event. It was 2000.
- Tolerate parameter value in illegal JSON format.
- Update BanyanDB Java Client to 0.4.0
- Support aggregate `Labeled Value Metrics` in MQE.
- [Breaking change] Change the default label name in MQE from `label` to `_`.
- Bump up grpc version to 1.53.0.
- [Breaking change] Removed '&' symbols from shell scripts to avoid OAP server process running as a background process.
- Revert part of #10616 to fix the unexpected changes: if there is no data we should return an array with `0`s, but in #10616, an empty array is returned.
- Cache all service entity in memory for query.
- Bump up jackson version to 2.15.2.
- Increase the default memory size to avoid OOM.
- Bump up graphql-java to 21.0.
- Add Echo component ID(5015) language: Golang.
- Fix `index out of bounds exception` in `aggregate_labels` MQE function.
- Support MongoDB Server/Cluster monitoring powered by OTEL.
- Do not print configurations values in logs to avoid sensitive info leaked.
- Move created the latest index before retrieval indexes by aliases to avoid the 404 exception. This just prevents some interference from manual operations.
- Add more Go VM metrics, as new skywalking-go agent provided since its 0.2 release.
- Add component ID for Lock (ID=5016).
- [Breaking change] Adjust the structure of hooks in the `alarm-settings.yml`. Support multiple configs for each hook types and specifying the hooks in the alarm rule.
- Bump up Armeria to 1.24.3.
- Fix BooleanMatch and BooleanNotEqualMatch doing Boolean comparison.
- Support LogQL HTTP query APIs.
- Add Mux Server component ID(5017) language: Golang.
- Remove ElasticSearch 6.3.2 from our client lib tests.
- Bump up ElasticSearch server 8.8.1 to 8.9.0 for latest e2e testing. 8.1.0, 7.16.3 and 7.17.10 are still tested.
- Add OpenSearch 2.8.0 to our client lib tests.
- Use listening mode for apollo implementation of dynamic configuration.
- Add `view_as_seq` function in MQE for listing metrics in the given prioritized sequence.
- Fix the wrong default value of `k8sServiceNameRule` if it's not explicitly set.
- Improve PromQL to allow for multiple metric operations within a single query.
- Fix MQE Binary Operation between labeled metrics and other type of value result.
- Add component ID for Nacos (ID=150).
- Support `Compare Operation` in MQE.
- Fix the Kubernetes resource cache not refreshed.
- Fix wrong classpath that might cause OOM in startup.
- Enhance the `serviceRelation` in MAL by adding settings for the `delimiter` and `component` fields.
- [Breaking change] Support MQE in the Alerting. The Alarm Rules configuration(alarm-settings.yml), add `expression` field and remove `metrics-name/count/threshold/op/only-as-condition` fields and remove `composite-rules` configuration.
- Check results in ALS as per downstream/upstream instead of per log.
- Fix GraphQL query `listInstances` not using endTime query
- Do not start server and Kafka consumer in init mode.
- Add Iris component ID(5018).
- Add OTLP Tracing support as a Zipkin trace input.
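Because this release corrects an environment variable name, a deployment that still sets the old name would have its setting silently ignored after the upgrade. The following is a minimal pre-upgrade check, offered as a sketch: the rename pair comes from the note above, while the function name and the message wording are hypothetical.

```python
import os
from typing import List, Mapping

# Old name -> new name, as stated in the 9.6.0 release note.
RENAMED = {
    "SW_OTEL_RECEIVER_ENABLED_OTEL_RULES": "SW_OTEL_RECEIVER_ENABLED_OTEL_METRICS_RULES",
}


def find_stale_settings(env: Mapping[str, str]) -> List[str]:
    """Return one warning per renamed variable that is still set under its old name."""
    warnings = []
    for old, new in RENAMED.items():
        if old in env:
            warnings.append(f"{old} is set; use {new} instead")
    return warnings


if __name__ == "__main__":
    # Check the environment of the current process, e.g. inside the OAP container.
    for line in find_stale_settings(os.environ):
        print(line)
```

The table can be extended with other renamed or removed settings when preparing later upgrades.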
UI
- Fix metric name `browser_app_error_rate` in `Browser-Root` dashboard.
- Fix display name of `endpoint_cpm` for endpoint list in `General-Service` dashboard.
- Implement customize menus and marketplace page.
- Fix minTraceDuration and maxTraceDuration types.
- Fix init minTime to Infinity.
- Bump dependencies to fix vulnerabilities.
- Add scss variables.
- Fix the title of instance list and notices in the continue profiling.
- Add a link to explain the expression metric, add units in the continue profiling widget.
- Calculate string width to set Tabs name width.
- [Breaking change] Removed '&' symbols from shell scripts to avoid web application server process running as a background process.
- Reset chart label.
- Fix service associates instances.
- Remove node-sass.
- Fix commit error on Windows.
- Apply MQE on `MYSQL`, `POSTGRESQL`, `REDIS`, `ELASTICSEARCH` and `DYNAMODB` layer UI-templates.
- Apply MQE on Virtual-Cache layer UI-templates
- Apply MQE on APISIX, AWS_EKS, AWS_GATEWAY and AWS_S3 layer UI templates.
- Apply MQE on RabbitMQ Dashboards.
- Apply MQE on Virtual-MQ layer UI-templates
- Apply MQE on Infra-Linux layer UI-templates
- Apply MQE on Infra-Windows layer UI-templates
- Apply MQE on Browser layer UI-templates.
- Implement MQE on topology widget.
- Fix getEndpoints keyword blank.
- Implement a breadcrumb component as navigation.
Documentation
- Add Go agent into the server agent documentation.
- Add data unit description in the configuration of continuous profiling policy.
- Remove `storage extension` doc, as it is expired.
- Remove `how to add menu` doc, as SkyWalking supports marketplace and new backend-based setup.
- Separate contribution docs to a new menu structure.
- Add a doc to explain how to manage i18n.
- Add a doc to explain OTLP Trace support.
- Fix typo in `dynamic-config-configmap.md`.
- Fix out-dated docs about Kafka fetcher.
- Remove 3rd part fetchers from the docs, as they are not maintained anymore.
All issues and pull requests are here
9.5.0
Download
https://skywalking.apache.org/downloads/
Notice
Don't download source codes from this page.
Please follow build document, if you want to build source codes by yourself.
New Topology Layout
Elasticsearch Server Monitoring
Project
- Fix `Duplicate class found` due to the `delombok` goal.
OAP Server
- Fix wrong layer of metric `user error` in DynamoDB monitoring.
- ElasticSearch storage does not check field types when OAP running in `no-init` mode.
- Support to bind TLS status as a part of component for service topology.
- Fix component ID priority bug.
- Fix component ID of topology overlap due to storage layer bugs.
- [Breaking Change] Enhance JDBC storage through merging tables and managing day-based table rolling.
- [Breaking Change] Sharding-MySQL implementations and tests get removed due to we have the day-based rolling mechanism by default
- Fix otel k8s-cluster rule add namespace dimension for MAL aggregation calculation(Deployment Status,Deployment Spec Replicas)
- Support continuous profiling feature.
- Support collect process level related metrics.
- Fix K8sRetag reads the wrong k8s service from the cache due to a possible namespace mismatch.
- [Breaking Change] Support cross-thread trace profiling. The data structure and query APIs are changed.
- Fix PromQL HTTP API `/api/v1/labels` response missing `service` label.
- Fix possible NPE when initialize `IntList`.
- Support parse PromQL expression has empty labels in the braces for metadata query.
- Support alarm metric OP `!=`.
- Support metrics query indicates whether value == 0 represents actually zero or no data.
- Fix `NPE` when query the not exist series indexes in ElasticSearch storage.
- Support collecting memory buff/cache metrics in VM monitoring.
- PromQL: Remove empty values from the query result, fix `/api/v1/metadata` param `limit` could cause out of bound.
- Support monitoring the total number metrics of k8s StatefulSet and DaemonSet.
- Support Amazon API Gateway monitoring.
- Bump up graphql-java to fix cve.
- Bump up Kubernetes Java client.
- Support Redis Monitoring.
- Add component ID for amqp, amqp-producer and amqp-consumer.
- Support no-proxy mode for aws-firehose receiver
- Bump up armeria to 1.23.1
- Support Elasticsearch Monitoring.
- Fix PromQL HTTP API `/api/v1/series` response missing `service` label when matching metric.
- Support ServerSide TopN for BanyanDB.
- Add component ID for Jersey.
- Remove OpenCensus support, the related codes and docs as it's sunsetting.
- Support dynamic configuration of searchableTracesTags
- Support `exportErrorStatusTraceOnly` for exporting only the error status trace segments through the Kafka channel.
- Add component ID for Grizzly.
- Fix potential NPE in Zipkin receiver when the `Span` is missing some fields.
- Filter out unknown_cluster metric data.
- Support RabbitMQ Monitoring.
- Support Redis slow logs collection.
- Fix data loss when query continuous profiling task record.
- Adapt the continuous profiling task query GraphQL.
- Support Metrics Query Expression(MQE), which allows users to do simple query-stage calculation through the expression.
- Deprecate metrics query v2 protocol.
- Deprecate record query protocol.
- Add component ID for go-redis.
- Add OpenSearch 2.8.0 to test case.
- Add `ai-pipeline` module.
- Support HTTP URI formatting through `ai-pipeline` to do pattern recognition.
- Add new HTTP URI grouping engine with benchmark.
- [Breaking Change] Use the new HTTP URI grouping engine to replace the old regex based mechanism.
- Support `sumLabeled` in `MAL`.
- Migrate from kubernetes-client/java to fabric8 client.
- Envoy ALS generated relation metrics consider HTTP status codes >= 400 as errors at the client side.
- Add cause message field when query continuous profiling task.
UI
- Revert: cpm5d function. This feature is cancelled from backend.
- Fix: alerting link breaks on the topology.
- Refactor Topology widget to make it more hierarchical.
  - Choose `User` as the first node.
  - If `User` node is absent, choose the busiest node (which has the most calls of all).
  - Do a left-to-right flow process.
  - At the same level, list nodes from top to bottom in alphabetical order.
- Fix filter ID when ReadRecords metric associates with trace.
- Add AWS API Gateway menu.
- Change trace profiling protocol.
- Add Redis menu.
- Optimize data types.
- Support isEmptyValue flag for metrics query.
- Add elasticsearch menu.
- [Clean UI templates before upgrade] Set `showSymbol: true`, and make the data points show on the Line graph.
  Please clean the `ui_template` index in Elasticsearch storage or the table in JDBC storage.
- [Clean UI templates before upgrade] UI templates: Simplify metric name with the label.
- Add MQ menu.
- Add Jersey icon.
- Fix: set endpoint and instance selectors with url parameters correctly.
- Bump up dependencies versions icons-vue 1.1.4, element-plus 2.1.0, nanoid 3.3.6, postcss 8.4.23
- Add OpenTelemetry log protocol support.
- [Breaking Change] Configuration key `enabledOtelRules` is renamed to `enabledOtelMetricsRules` and the corresponding environment variable is renamed to `SW_OTEL_RECEIVER_ENABLED_OTEL_METRICS_RULES`.
- Add grizzly icon.
- Fix: the Instance List data display error.
- Fix: set topN type to Number.
- Support Metrics Query Expression(MQE), which allows users to do simple query-stage calculation through the expression.
- Bump up zipkin ui dependency to 2.24.1.
- Bump up vite to 4.0.5.
- Apply MQE on `General` and `Virtual-Database` layer UI-templates.
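The layout rules listed above for the refactored Topology widget (start from `User`, otherwise the busiest node, flow left to right, sort each level alphabetically) can be sketched as a breadth-first layering. This is a minimal illustration in Python, not the UI's actual code; the function name `layer_topology` and the `(source, target)` call-list input are assumptions made for the example.

```python
from collections import defaultdict, deque


def layer_topology(calls, root_name="User"):
    """Group nodes into left-to-right levels (hypothetical helper).

    calls: list of (source, target) pairs, one per observed call.
    Returns a list of levels; each level is an alphabetically sorted list.
    """
    neighbors = defaultdict(set)
    call_count = defaultdict(int)
    for source, target in calls:
        neighbors[source].add(target)
        neighbors[target]  # register nodes that only appear as targets
        call_count[source] += 1
        call_count[target] += 1

    if not neighbors:
        return []

    # Rules 1 and 2: prefer the User node, else the node with the most calls.
    if root_name in neighbors:
        root = root_name
    else:
        root = max(sorted(neighbors), key=lambda n: call_count[n])

    # Rule 3: breadth-first walk, each hop moves one level to the right.
    levels, seen, queue = [], {root}, deque([(root, 0)])
    while queue:
        node, depth = queue.popleft()
        if depth == len(levels):
            levels.append([])
        levels[depth].append(node)
        for nxt in sorted(neighbors[node]):
            if nxt not in seen:
                seen.add(nxt)
                queue.append((nxt, depth + 1))

    # Rule 4: within a level, order nodes alphabetically (top to bottom).
    return [sorted(level) for level in levels]
```

In this sketch, nodes that cannot be reached from the chosen root by following call direction are simply left out; a real layout would need a rule for placing them.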
Documentation
- Add Profiling related documentations.
- Add `SUM_PER_MIN` to MAL documentation.
- Make the log-related docs clearer and easier to extend with more formats.
- Update the cluster management and advanced deployment docs.
All issues and pull requests are here
9.4.0
PromQL and Grafana Support
Zipkin Lens UI Bundled
AWS S3 and DynamoDB monitoring
Project
- Bump up Zipkin and Zipkin lens UI dependency to 2.24.0.
- Bump up Apache parent pom version to 29.
- Bump up Armeria version to 1.21.0.
- Clean up maven `pom.xml`s.
- Bump up Java version to 11.
- Bump up snakeyaml to 2.0.
OAP Server
- Add `ServerStatusService` in the core module to provide a new way to expose booting status to other modules.
- Adds Micrometer as a new component.(ID=141)
- Refactor session cache in MetricsPersistentWorker.
- Cache enhancement - don't read new metrics from database in minute dimensionality.
// When
// (1) the time bucket of the server's latest stability status is provided
// 1.1 the OAP has booted successfully
// 1.2 the current dimensionality is in minute.
// 1.3 the OAP cluster is rebalanced due to scaling
// (2) the metrics are from the time after the timeOfLatestStabilitySts
// (3) the metrics don't exist in the cache
// the kernel should NOT try to load it from the database.
//
// Notice, about condition (2),
// for the specific minute of booted successfully, the metrics are expected to load from database when
// it doesn't exist in the cache.
- Remove the offset of metric session timeout according to worker creation sequence.
- Correct `MetricsExtension` annotations declarations in manual entities.
- Support component IDs' priority in process relation metrics.
- Remove abandon logic in MergableBufferedData, which caused unexpected no-update.
- Fix miss set `LastUpdateTimestamp` that caused the metrics session to expire.
- Rename MAL rule `spring-sleuth.yaml` to `spring-micrometer.yaml`.
- Fix memory leak in Zipkin API.
- Remove the dependency of `refresh_interval` of ElasticSearch indices from `elasticsearch/flushInterval` config. Now, it uses `core/persistentPeriod` + 5s as `refresh_interval` for all indices instead.
- Change `elasticsearch/flushInterval` to 5s(was 15s).
- Optimize `flushInterval` of ElasticSearch BulkProcessor to avoid extra periodical flush in the continuous bulk streams.
- An unexpected dot is added when exp is a pure metric name and expPrefix != null.
- Support monitoring MariaDB.
- Remove measure/stream specific interval settings in BanyanDB.
- Add global-specific settings used to override global configurations (e.g `segmentIntervalDays`, `blockIntervalHours`) in BanyanDB.
- Use TTL-driven interval settings for the `measure-default` group in BanyanDB.
- Fix wrong group of non time-relative metadata in BanyanDB.
- Refactor `StorageData#id` to the new StorageID object from a String type.
- Support multiple component IDs in the service topology level.
- Add `ElasticSearch.Keyword` annotation to declare the target field type as `keyword`.
- [Breaking Change] Column `component_id` of `service_relation_client_side` and `service_relation_server_side` have been replaced by `component_ids`.
- Support `priority` definition in the `component-libraries.yml`.
- Enhance service topology query. When there are multiple components detected from the server side, the component type of the node would be determined by the priority, which was random in the previous release.
- Remove `component_id` from `service_instance_relation_client_side` and `service_instance_relation_server_side`.
- Make the satellite E2E test more stable.
- Add Istio 1.16 to test matrix.
- Register ValueColumn as Tag for Record in BanyanDB storage plugin.
- Bump up Netty to 4.1.86.
- Remove unnecessary additional columns when storage is in logical sharding mode.
- The cluster coordinator supports a watch mechanism for notifying `RemoteClientManager` and `ServerStatusService`.
- Fix ServiceMeshServiceDispatcher overwrite ServiceDispatcher debug file when open SW_OAL_ENGINE_DEBUG.
- Use `groupBy` and `in` operators to optimize topology query for BanyanDB storage plugin.
- Support server status watcher for `MetricsPersistentWorker` to check whether the metrics require initialization.
- Fix the meter values being incorrect when using the `sumPerMinLabeled` or `sumHistogramPercentile` MAL function.
- Fix cannot display attached events when using Zipkin Lens UI query traces.
- Remove `time_bucket` for both Stream and Measure kinds in BanyanDB plugin.
- Merge `TIME_BUCKET` of `Metrics` and `Record` into `StorageData`.
- Support no `layer` in the `listServices` query.
- Fix `time_bucket` of `ServiceTraffic` not set correctly in `slowSql` of MAL.
- Correct the TopN record query DAO of BanyanDB.
- Tweak interval settings of BanyanDB.
- Support monitoring AWS Cloud EKS.
- Bump BanyanDB Java client to 0.3.0-rc1.
- Remove `id` tag from measures.
- Add `Banyandb.MeasureField` to mark a column as a BanyanDB Measure field.
- Add `BanyanDB.StoreIDTag` to store a process's id for searching.
to store a process's id for searching. - [Breaking Change] The supported version of ShardingSphere-Proxy is upgraded from 5.1.2 to 5.3.1. Due to the changes of ShardingSphere's API, versions before 5.3.1 are not compatible.
- Add the eBPF network profiling E2E Test in the per storage.
- Fix TCP service instances are lack of instance properties like `pod` and `namespace`, which causes Pod log not to work for TCP workloads.
- Add Python HBase happybase module component ID(94).
- Fix gRPC alarm cannot update settings from dynamic configuration source.
- Add `batchOfBytes` configuration to limit the size of bulk flush.
- Add Python Websocket module component ID(7018).
- [Optional] Optimize single trace query performance by customizing routing in ElasticSearch. SkyWalking trace segments and Zipkin spans are using trace ID for routing. This is OFF by default, controlled by `storage/elasticsearch/enableCustomRouting`.
- Enhance OAP HTTP server to support HTTPS
- Remove handler scan in otel receiver, manual initialization instead
- Add aws-firehose-receiver to support collecting AWS CloudWatch metric(OpenTelemetry format). Notice, no HTTPS/TLS setup support. By following AWS Firehose request, it uses proxy request (`https://...` instead of `/aws/firehose/metrics`), there must be a proxy(Nginx, Envoy, etc.).
- Avoid Antlr dependencies' versions might be different in compile time and runtime.
- Now `PrometheusMetricConverter#escapedName` also support converting `/` to `_`.
- Add missing TCP throughput metrics.
- Refactor `@Column` annotation, swap `Column#name` and `ElasticSearch.Column#columnAlias` and rename `ElasticSearch.Column#columnAlias` to `ElasticSearch.Column#legacyName`.
- Add Python HTTPX module component ID(7019).
- Migrate tests from junit 4 to junit 5.
- Refactor http-based alarm plugins and extract common logic to `HttpAlarmCallback`.
- Support Amazon Simple Storage Service (Amazon S3) metrics monitoring
- Support process Sum metrics with AGGREGATION_TEMPORALITY_DELTA case
- Support Amazon DynamoDB monitoring.
- Support prometheus HTTP API and promQL.
- `Scope` in the Entity of Metrics query v1 protocol is not required and is corrected automatically. The scope is determined based on the metric itself.
- Add explicit `ReadTimeout` for ConsulConfigurationWatcher to avoid `IllegalArgumentException: Cache watchInterval=10sec >= networkClientReadTimeout=10000ms`.
- Fix `DurationUtils.getDurationPoints` exceed, when `startTimeBucket` equals `endTimeBucket`.
- Support process OpenTelemetry ExponentialHistogram metrics
- Add FreeRedis component ID(3018).
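The "Cache enhancement" conditions listed earlier in this release (the `// When ...` comment) amount to a small predicate. The sketch below restates one reading of it in Python, for illustration only; the function and parameter names are hypothetical and do not mirror the OAP source, and time buckets are assumed to be comparable integers.

```python
def should_load_from_database(
    in_cache,
    is_minute_dimensionality,
    metric_time_bucket,
    time_of_latest_stability_sts=None,
):
    """Return True when a metric should be read from storage.

    time_of_latest_stability_sts is None until the server reports a
    stable status. All names here are illustrative assumptions.
    """
    if in_cache:
        return False  # condition (3) not met: the cache already has the metric
    if time_of_latest_stability_sts is None:
        return True  # condition (1) not met: no stability status yet
    if not is_minute_dimensionality:
        return True  # condition (1.2) not met: only minute metrics skip the read
    # Condition (2): strictly after the stability time bucket. The minute of
    # the successful boot itself still loads from the database.
    return metric_time_bucket <= time_of_latest_stability_sts
```

Under this reading, a minute-level metric that is absent from the cache and newer than the stability time bucket is treated as brand new, so no database read is attempted.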
UI
- Add Zipkin Lens UI to webapp, and proxy it to context path `/zipkin`.
- Migrate the build tool from vue cli to Vite4.
- Fix Instance Relation and Endpoint Relation dashboards show up.
- Add Micrometer icon.
- Update MySQL UI to support MariaDB.
- Add AWS menu for supporting AWS monitoring.
- Add missing FastAPI logo.
- Update the log details page to support the formatted display of JSON content.
- Fix build config.
- Avoid being unable to drag process nodes for the first time.
- Add node folder into ignore list.
- Add ElPopconfirm to component types.
- Add an iframe widget for zipkin UI.
- Optimize graph tooltips to make them more friendly.
- Bump json5 from 1.0.1 to 1.0.2.
- Add websockets icon.
- Implement independent mode for widgets.
- Bump http-cache-semantics from 4.1.0 to 4.1.1.
- Update menus for OpenFunction.
- Add auto fresh to widgets independent mode.
- Fix: clear trace ID on the Log and Trace widgets after using association.
- Fix: reset duration for query conditions after time range changes.
- Add AWS S3 menu.
- Refactor: optimize side bar component to make it more friendly.
- Fix: remove duplicate popup message for query result.
- Add logo for HTTPX.
- Refactor: optimize the attached events visualization in the trace widget.
- Update BanyanDB client to 0.3.1.
- Add AWS DynamoDB menu.
- Fix: add auto period to the independent mode for widgets.
- Optimize menus and add Windows monitoring menu.
- Add a calculation for the cpm5dAvg.
- Add a cpm5d calculation.
- Fix data processing error in the eBPF profiling widget.
- Support for double quotes in SlowSQL statements.
- Fix: the wrong position of the menu when clicking the topology node.
##...
9.3.0
Metrics Association
Dashboard | Pop-up Trace Query |
---|---|
APISIX Dashboard
Use Sharding MySQL as the Database
Virtual Cache Performance
Virtual MQ Performance
Project
- Bump up the embedded `swctl` version in OAP Docker image.
OAP Server
- Add component ID(133) for impala JDBC Java agent plugin and component ID(134) for impala server.
- Use prepareStatement in H2SQLExecutor#getByIDs.(No function change).
- Bump up snakeyaml to 1.32 for fixing CVE.
- Fix `DurationUtils.convertToTimeBucket` missing date format verification.
- Enhance LAL to support converting LogData to DatabaseSlowStatement.
- [Breaking Change] Change the LAL script format(Add layer property).
- Adapt ElasticSearch 8.1+, migrate from removed APIs to recommended APIs.
- Support monitoring MySQL slow SQLs.
- Support analyzing cache related spans to provide metrics and slow commands for cache services from client side
- Optimize virtual database, fix dynamic config watcher NPE when default value is null
- Remove physical index existing check and keep template existing check only to avoid meaningless `retry wait` in `no-init` mode.
- Make sure instance list ordered in TTL processor to avoid TTL timer never runs.
- Support monitoring PostgreSQL slow SQLs.
- [Breaking Change] Support sharding MySQL database instances and tables by Shardingsphere-Proxy. SQL-Database requires removing tables `log_tag/segment_tag/zipkin_query` before OAP starts, if bump up from previous releases.
- Fix meter functions `avgHistogram`, `avgHistogramPercentile`, `avgLabeled`, `sumHistogram` having data conflict when downsampling.
- Do sorting `readLabeledMetricsValues` result forcedly in case the storage(database) doesn't return data consistent with the parameter list.
- Fix the wrong watch semantics in Kubernetes watchers, which causes heavy traffic to API server in some Kubernetes clusters, we should use `Get State and Start at Most Recent` semantic instead of `Start at Exact` because we don't need the changing history events, see https://kubernetes.io/docs/reference/using-api/api-concepts/#semantics-for-watch.
- Unify query services and DAOs codes time range condition to `Duration`.
- [Breaking Change]: Remove prometheus-fetcher plugin, please use OpenTelemetry to scrape Prometheus metrics and set up SkyWalking OpenTelemetry receiver instead.
- BugFix: histogram metrics sent to MAL should be treated as OpenTelemetry style, not Prometheus style:
  (-infinity, explicit_bounds[i]] for i == 0
  (explicit_bounds[i-1], explicit_bounds[i]] for 0 < i < size(explicit_bounds)
  (explicit_bounds[i-1], +infinity) for i == size(explicit_bounds)
- Support Golang runtime metrics analysis.
- Add APISIX metrics monitoring
- Support skywalking-client-js report empty `service version` and `page path`, set default version as `latest` and default page path as `/`(root). Fix the error `fetching data (/browser_app_page_pv0) : Can't split endpoint id into 2 parts`.
- [Breaking Change] Limit the max length of trace/log/alarm tag's `key=value`, set the max length of column `tags` in tables `log_tag/segment_tag/alarm_record_tag` and column `query` in `zipkin_query` and column `tag_value` in `tag_autocomplete` to 256.
  SQL-Database requires altering these columns' length or removing these tables before OAP starts, if bump up from previous releases.
- Optimize the creation conditions of profiling task.
- Lazy load the Kubernetes metadata and switch from event-driven to polling. Previously we set up watchers to watch the Kubernetes metadata changes, which is perfect when there are deployment changes and SkyWalking can react to the changes in real time. However, when the cluster has many events (such as in a large cluster or some special Kubernetes engine like OpenShift), the requests sent from SkyWalking become unpredictable, i.e. SkyWalking might send massive requests to the Kubernetes API server, causing heavy load on the API server. This PR switches from the watcher mechanism to a polling mechanism: SkyWalking polls the metadata at a specified interval, so that the requests sent to the API server are predictable (~10 requests every `interval`, 3 minutes), and the request count is constant regardless of the cluster's changes. With this change SkyWalking can't react to cluster changes immediately, but the delay is acceptable in our case.
- Optimize the query time of tasks in ProfileTaskCache.
- Fix metrics was put into wrong slot of the window in the alerting kernel.
- Support `sumPerMinLabeled` in `MAL`.
- Bump up jackson databind, snakeyaml, grpc dependencies.
- Support export `Trace` and `Log` through Kafka.
- Add new config initialization mechanism of module provider. This is a ModuleManager lib kernel level change.
- [Breaking Change] Support new records query protocol, rename the column named `service_id` to `entity_id` for support difference entity.
  Please re-create `top_n_database_statement` index/table.
- Remove improper self-obs metrics in JvmMetricsHandler(for Kafka channel).
- gRPC stream canceling code is not logged as an error when the client cancels the stream. The client cancels the stream when the pod is terminated.
- [Breaking Change] Change the way of loading MAL rules(support pattern).
- Move k8s relative MAL files into `/otel-rules/k8s`.
- [Breaking Change] Refactor service mesh protobuf definitions and split TCP-related metrics to individual definition.
- Add `TCP{Service,ServiceInstance,ServiceRelation,ServiceInstanceRelation}` sources and split TCP-related entities out from original `Service,ServiceInstance,ServiceRelation,ServiceInstanceRelation`.
- [Breaking Change] TCP-related source names are changed, fields of TCP-related sources are changed, please refer to the latest `oal/tcp.oal` file.
- Do not log error logs when failed to create ElasticSearch index because the index is created already.
- Add virtual MQ analysis for native traces.
- Support Python runtime metrics analysis.
- Support `sampledTrace` in LAL.
- Support multiple rules with different names under the same layer of LAL script.
- (Optimization) Reduce the buffer size(queue) of MAL(only) metric streams. Set L1 queue size as 1/20, L2 queue size as 1/2.
- Support monitoring MySQL/PostgreSQL in the cluster mode.
- [Breaking Change] Migrate to BanyanDB v0.2.0.
  - Adopt new OR logical operator for,
    - `MeasureIDs` query
    - `BanyanDBProfileThreadSnapshotQueryDAO` query
    - Multiple `Event` conditions query
    - Metrics query
  - Simplify Group check and creation
  - Partially apply `UITemplate` changes
  - Support `index_only`
  - Return `CompletableFuture<Void>` directly from BanyanDB client
  - Optimize data binary parse methods in *LogQueryDAO
  - Support different indexType
  - Support configuration for TTL and (block|segment) intervals
- Elasticsearch storage: Provide system environment variable(`SW_STORAGE_ES_SPECIFIC_INDEX_SETTINGS`) and support specify the settings `(number_of_shards/number_of_replicas)` for each index individually.
- Elasticsearch storage: Support update index settings `(number_of_shards/number_of_replicas)` for the index template after rebooting.
- Optimize MQ Topology analysis. Use entry span's peer from the consumer side as source service when no producer instrumentation(no cross-process reference).
- Refactor JDBC storage implementations to reuse logics.
- Fix `ClassCastException` in `LoggingConfigWatcher`.
- Support span attached event concept in Zipkin and SkyWalking trace query.
- Support span attached events on Zipkin lens UI.
- Force UTF-8 encoding in `JsonLogHandler` of `kafka-fetcher-plugin`.
- Fix max length to 512 of entity, instance and endpoint IDs in trace, log, profiling, topN tables(JDBC storages). The value was 200 by default.
- Add component IDs(135, 136, 137) for EventMesh server and client-side plugins.
- Bump up Kafka client to 2.8.1 to fix CVE-2021-38153.
- Remove `lengthEnvVariable` for `Column` as it never works as expected.
- Add `LongText` to support longer logs persistent as a text type in ElasticSearch, instead of a keyword, to avoid length limitation.
- Fix wrong system variable name `SW_CORE_ENABLE_ENDPOINT_NAME_GROUPING_BY_OPENAPI`. It was opaenapi.
- Fix not-time-series model blocking OAP boots in no-init mode.
- Fix `ShardingTopologyQueryDAO.loadServiceRelationsDetectedAtServerSide` invoke backend miss parameter `serviceIds`.
- Changed system variable `SW_SUPERDATASET_STORAGE_DAY_STEP` to `SW_STORAGE_ES_SUPER_DATASET_DAY_STEP` to be consistent with other ES storage related variables.
- Fix ESEventQueryDAO missing metric_table boolQuery criteria.
- Add default entity name(`_blank`) if absent ...
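The OpenTelemetry-style bucket boundaries quoted in the histogram BugFix of this release (inclusive upper bounds, plus a final overflow bucket) can be illustrated with Python's standard `bisect` module. This is only a sketch of the boundary rule, not MAL code; the helper name `otel_bucket_index` is an assumption for illustration.

```python
from bisect import bisect_left


def otel_bucket_index(value, explicit_bounds):
    """Index of the bucket that holds value (illustrative helper).

    Bucket i covers (explicit_bounds[i-1], explicit_bounds[i]];
    the last bucket, index len(explicit_bounds), covers everything above.
    explicit_bounds must be sorted in ascending order.
    """
    # bisect_left returns the first i with explicit_bounds[i] >= value,
    # which is the bucket whose inclusive upper bound admits value.
    return bisect_left(explicit_bounds, value)
```

With bounds `[10, 50, 100]`, a value equal to a bound lands in the bucket that the bound closes, and anything above 100 lands in the overflow bucket at index 3.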
9.2.0
eBPF Network Profiling for K8s Pod
Event and Metrics Association
MySQL Server Monitoring
PostgreSQL Server Monitoring
Project
- [Critical] Fix a low performance issue of metrics persistent in the ElasticSearch storage implementation. One single metric could have to wait for an unnecessary 7~10s(System Env Variable `SW_STORAGE_ES_FLUSH_INTERVAL`) since 8.8.0 - 9.1.0 releases.
- Upgrade Armeria to 1.16.0, Kubernetes Java client to 15.0.1.
OAP Server
- Add more entities for Zipkin to improve performance.
- ElasticSearch: scroll id should be updated when scrolling as it may change.
- Mesh: fix only last rule works when multiple rules are defined in metadata-service-mapping.yaml.
- Support sending alarm messages to PagerDuty.
- Support Zipkin kafka collector.
- Add `VIRTUAL` detect type to Process for Network Profiling.
- Add component ID(128) for Java Hutool plugin.
- Add Zipkin query exception handler, response error message for illegal arguments.
- Fix a NullPointerException in the endpoint analysis, which would cause missing MQ-related `LocalSpan` in the trace.
- Add `forEach`, `processRelation` function to MAL expression.
- Add `expPrefix`, `initExp` in MAL config.
- Add component ID(7015) for Python Bottle plugin.
- Remove legacy OAL `percentile` functions, `p99`, `p95`, `p90`, `p75`, `p50` func(s).
- Revert #8066. Keep all metrics persistent even it is default value.
- Skip loading UI templates if folder is empty or doesn't exist.
- Optimize ElasticSearch query performance by using `_mGet` and physical index name rather than alias in these scenarios, (a) Metrics aggregation (b) Zipkin query (c) Metrics query (d) Log query
- Support the `NETWORK` type of eBPF Profiling task.
- Support `sumHistogram` in `MAL`.
- [Breaking Change] Make the eBPF Profiling task support to the service instance level, index/table `ebpf_profiling_task` is required to be re-created when bump up from previous releases.
- Fix race condition in Banyandb storage
- Support `SUM_PER_MIN` downsampling in `MAL`.
- Support `sumHistogramPercentile` in `MAL`.
- Add `VIRTUAL_CACHE` to Layer, to fix conjectured Redis server, which icon can't show on the topology.
- [Breaking Change] Elasticsearch storage merge all metrics/meter and records(without super datasets) indices into one physical index template `metrics-all` and `records-all` on the default setting.
  Provide system environment variable(`SW_STORAGE_ES_LOGIC_SHARDING`) to shard metrics/meter indices into multi-physical indices as the previous versions(one index template per metric/meter aggregation function).
  In the current one index mode, users still could choose to adjust ElasticSearch's shard number(`SW_STORAGE_ES_INDEX_SHARDS_NUMBER`) to scale out.
  More details please refer to New ElasticSearch storage option explanation in 9.2.0 and backend-storage doc
- [Breaking Change] Index/table `ebpf_profiling_schedule` added a new column `ebpf_profiling_schedule_id`, the H2/Mysql/Tidb/Postgres storage users are required to re-created it when bump up from previous releases.
- Fix Zipkin trace query the max size of spans.
- Add `tls` and `https` component IDs for Network Profiling.
- Support Elasticsearch column alias for the compatibility between storage logicSharding model and no-logicSharding model.
- Support MySQL monitoring.
- Support PostgreSQL monitoring.
- Fix query services by serviceId error when Elasticsearch storage `SW_STORAGE_ES_QUERY_MAX_SIZE` > 10000.
- Support sending alarm messages to Discord.
- Fix query history process data failure.
- Optimize TTL mechanism for Elasticsearch storage, skip executed indices in one TTL rotation.
- Add Kubernetes support module to share codes between modules and reduce calls to Kubernetes API server.
- Bump up Kubernetes Java client to fix cve.
- Adapt OpenTelemetry native metrics protocol.
- [Breaking Change] rename configuration folder from `otel-oc-rules` to `otel-rules`.
- [Breaking Change] rename configuration field from `enabledOcRules` to `enabledOtelRules` and environment variable name from `SW_OTEL_RECEIVER_ENABLED_OC_RULES` to `SW_OTEL_RECEIVER_ENABLED_OTEL_RULES`.
- [Breaking Change] Fix JDBC TTL to delete additional tables data.
  SQL Database requires removing `segment`, `segment_tag`, `logs`, `logs_tag`, `alarms`, `alarms_tag`, `zipkin_span`, `zipkin_query` before OAP starts.
- SQL Database: add `@SQLDatabase.ExtraColumn4AdditionalEntity` to support add an extra column from parent to an additional table.
- Add component ID(131) for Java Micronaut plugin
- Add component ID(132) for Nats java client plugin
UI
- Fix query conditions for the browser logs.
- Implement a url parameter to activate tab index.
- Fix clear interval fail when switch autoRefresh to off.
- Optimize log tables.
- Fix log detail pop-up page doesn't work.
- Optimize table widget to hide the whole metric column when no metric is set.
- Implement the Event widget. Remove `event` menu.
- Fix span detail text overlap.
- Add Python Bottle Plugin Logo.
- Implement an association between widgets(line, bar, area graphs) with time.
- Fix tag dropdown style.
- Hide the copy button when db.statement is empty.
- Fix legend metrics for topology.
- Dashboard: Add metrics association.
- Dashboard: Fix `FaaS-Root` document link and topology service relation dashboard link.
- Dashboard: Fix `Mesh-Instance` metric `Throughput`.
- Dashboard: Fix `Mesh-Service-Relation` metric `Throughput` and `Proxy Sidecar Internal Latency in Nanoseconds (Client Response)`.
- Dashboard: Fix `Mesh-Instance-Relation` metric `Throughput`.
- Enhance associations for the Event widget.
- Add event widgets in dashboard where applicable.
- Fix dashboard list search box not work.
- Fix short time range.
- Fix event widget incompatibility in Safari.
- Refactor the tags component to support searching for tag keys and values.
- Implement the log widget and the trace widget associate with each other, remove log tables on the trace widget.
- Add log widget to general service root.
- Associate the event widget with the trace and log widget.
- Add the MYSQL layer and update layer routers.
- Fix query order for trace list.
- Add a calculation to convert seconds to days.
- Add Spring Sleuth dashboard to general service instance.
- Support the process dashboard and create the time range text widget.
- Fix picking calendar with a wrong time range and setting a unique value for dashboard grid key.
- Add PostgreSQL to Database sub-menu.
- Implement the network profiling widget.
- Add Micronaut icon for Java plugin.
- Add Nats icon for Java plugin.
- Bump moment and @vue/cli-plugin-e2e-cypress.
- Add Network Profiling for Service Mesh DP instance and K8s pod panels.
Documentation
- Fix invalid links in release docs.
- Clean up doc about event metrics.
- Add a table for metric calculations in the ui doc.
- Add an explanation for alerting kernel and its in-memory window mechanism.
- Add more docs for widget details.
- Update alarm doc to introduce the configuration property key.
- Fix dependency license's NOTICE and binary jar included issues in the source release.
- Add eBPF CPU profiling doc.
All issues and pull requests are here
9.1.0
eBPF profiling
On-demand Pod Log
Project
- [IMPORTANT] Remove InfluxDB 1.x and Apache IoTDB 0.X as storage options, check details at here. Remove converter-moshi 2.5.0, influx-java 2.15, iotdb java 0.12.5, thrift 0.14.1, moshi 1.5.0, msgpack 0.8.16 dependencies. Remove InfluxDB and IoTDB relative codes and E2E tests.
- Upgrade OAP dependencies zipkin to 2.23.16, H2 to 2.1.212, Apache Freemarker to 2.3.31, gRPC-java 1.46.0, netty to 4.1.76.
- Upgrade Webapp dependencies, spring-cloud-dependencies to 2021.0.2, logback-classic to 1.2.11
- [IMPORTANT] Add BanyanDB storage implementation. Notice BanyanDB is currently under active development and SHOULD NOT be used in production cluster.
OAP Server
- Add component definition(ID=127) for `Apache ShenYu (incubating)`.
- Fix Zipkin receiver: Decode spans error, missing `Layer` for V9 and wrong time bucket for generate Service and Endpoint.
- [Refactor] Move SQLDatabase(H2/MySQL/PostgreSQL), ElasticSearch and BanyanDB specific configurations out of column.
- Support BanyanDB global index for entities. Log and Segment record entities declare this new feature.
- Remove unnecessary analyzer settings in columns of templates. Many were added due to analyzer's default value.
- Simplify the Kafka Fetch configuration in cluster mode.
- [Breaking Change] Update the eBPF Profiling task to the service level, please delete index/table: `ebpf_profiling_task`, `process_traffic`.
- Fix event can't split service ID into 2 parts.
- Fix OAP Self-Observability metric `GC Time` calculation.
- Set `SW_QUERY_MAX_QUERY_COMPLEXITY` default value to `1000`
- Webapp module (for UI) enabled compression.
- [Breaking Change] Add layer field to event, report an event without layer is not allowed.
- Fix ES flush thread stops when flush schedule task throws exception, such as ElasticSearch flush failed.
- Fix ES BulkProcessor in BatchProcessEsDAO was initialized multiple times and created multiple ES flush schedule tasks.
- HTTPServer support the handler register with allowed HTTP methods.
- [Critical] Revert Enhance DataCarrier#MultipleChannelsConsumer to add priority to avoid consuming issues.
- Fix the problem that some configurations (such as group.id) did not take effect due to the override order when using the kafkaConsumerConfig property to extend the configuration in Kafka Fetcher.
- Remove build time from the OAP version.
- Add data-generator module to run OAP in testing mode, generating mock data for testing.
- Support receive Kubernetes processes from gRPC protocol.
- Fix the problem that es index(TimeSeriesTable, eg. endpoint_traffic, alarm_record) didn't create even after rerun with init-mode. This problem caused the OAP server to fail to start when the OAP server was down for more than a day.
- Support autocomplete tags in traces query.
- [Breaking Change] Replace all configurations `**_JETTY_**` to `**_REST_**`.
- Add the support eBPF profiling field into the process entity.
- E2E: fix log test miss verify LAL and metrics.
- Enhance Converter mechanism in kernel level to make BanyanDB native feature more effective.
- Add TermsAggregation properties collect_mode and execution_hint.
- Add "execution_hint": "map", "collect_mode": "breadth_first" for aggregation and topology query to improve 5-10x performance.
- Clean up scroll contexts after used.
- Support autocomplete tags in logs query.
- Enhance Deprecated MetricQuery(v1) getValues querying to asynchronous concurrency query
- Fix the pod match error when the service has multiple selector in kubernetes environment.
- VM monitoring adapts the 0.50.0 of the
opentelemetry-collector
. - Add Envoy internal cost metrics.
- Remove `Layer` concept from `ServiceInstance`.
- Remove unnecessary `onCompleted` on gRPC `onError` callback.
- Remove `Layer` concept from `Process`.
- Update to list all eBPF profiling schedulers without duration.
- Storage(ElasticSearch): add search options to tolerate non-existent indices.
- Fix the problem that `MQ` has the wrong `Layer` type.
- Fix NoneStream model has wrong downsampling (was Second, should be Minute).
- SQL Database: provide `@SQLDatabase.AdditionalEntity` to support creating additional tables from a model.
- [Breaking Change] SQL Database: remove SQL Database config `maxSizeOfArrayColumn` and `numOfSearchableValuesPerTag`.
- [Breaking Change] SQL Database: move `Tags list` from `Segment`, `Logs`, `Alarms` to their additional table.
- [Breaking Change] Remove `total` field in Trace, Log, Event, Browser log, and alarm list query.
- Support `OFF_CPU` eBPF Profiling.
- Fix SumAggregationBuilder#build should use the SumAggregation rather than MaxAggregation.
- Add TiDB, OpenSearch, Postgres storage options to Trace and eBPF Profiling E2E testing.
- Add OFF CPU eBPF Profiling E2E Testing.
- Fix searchableTag as `rpc.status_code` and `http.status_code`. `status_code` had been removed.
- Fix scroll query failure exception.
- Add `profileDataQueryBatchSize` config in Elasticsearch Storage.
- Add APIs to query Pod log on demand.
- Remove OAL for events.
- Simplify the index name formatting logic in ES storage.
- Add instance properties extractor in MAL.
- Support Zipkin trace collection and the Zipkin trace query API.
- [Breaking Change] Zipkin receiver mechanism changes and traces do not stream into OAP Segment anymore.
UI
- General service instance: move `Thread Pool` from JVM to Overview, fix `JVM GC Count` calculation.
- Add Apache ShenYu (incubating) component LOGO.
- Show more metrics on service/instance/endpoint list on the dashboards.
- Support average values of metrics on the service/list/endpoint table widgets, with pop-up linear graph.
- Fix viewLogs button query no data.
- Fix UTC when page loads.
- Implement the eBPF profile widget on dashboard.
- Optimize the trace widget.
- Avoid invalid query for topology metrics.
- Add the alarm and log tag tips.
- Fix spans details and task logs.
- Verify query params to avoid invalid queries.
- Mobile terminal adaptation.
- Fix: set dropdown for the Tab widget, init instance/endpoint relation selectors, update sankey graph.
- Add eBPF Profiling widget into General service, Service Mesh and Kubernetes tabs.
- Fix jump to endpoint-relation dashboard template.
- Fix set graph options.
- Remove the `Layer` field from the Instance and Process.
- Fix date time picker display when set hour to `0`.
- Implement tags auto-complete for Trace and Log.
- Support multiple trees for the flame graph.
- Fix the page doesn't need to be re-rendered when the url changes.
- Remove unexpected data for exporting dashboards.
- Fix duration time.
- Remove the total field from query conditions.
- Fix minDuration and maxDuration for the trace filter.
- Add Log configuration for the browser templates.
- Fix query conditions for the browser logs.
- Add Spanish Translation.
- Visualize the OFF CPU eBPF profiling.
- Add Spanish language to UI.
- Sort spans with startTime or spanId in a segment.
- Visualize an on-demand log widget.
- Fix activating the correct tab index after renaming a Tabs name.
- FaaS dashboard support on-demand log (OpenFunction/functions-framework-go version > 0.3.0).
Documentation
- Add eBPF agent into probe introduction.
All issues and pull requests are here