DNSEssentials detection not working as they should #11663

Open
tbethe opened this issue Jan 14, 2025 · 6 comments

Comments

@tbethe

tbethe commented Jan 14, 2025

Describe the bug
I believe most of the DNSEssentials detections are broken. The problem is quite fundamental and has been there for so long that I fear I am simply misunderstanding what is going on. It really does seem wrong, though. Could someone either confirm that this is indeed wrong or explain to me why these detections are correct?

Some context on how I understand DNS logging:

As far as I understand, or at least in the logging that I am working with, the devices that do the logging are the DNS servers on the network. They log the queries and responses they exchange with upstream resolvers when resolving on behalf of clients on the internal network, and they log the queries they receive from those clients and the responses they send back to them.

For example, client A sends a query for example.com to resolver X. Resolver X resolves it via some public resolver and relays the response back to client A. In this case you'd get at least 4 entries in the DNS table.

  • Request from client A to resolver X
  • Request from resolver X to outside
  • Response from outside to resolver X
  • Response from resolver X to client A
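
As a rough illustration of the four entries listed above, here is a minimal Python sketch. The record type `DnsLogEntry`, the helper `entries_for_lookup`, and all IP addresses are hypothetical and only model the logging perspective described in this issue, not the actual schema of any table.

```python
from dataclasses import dataclass

# Hypothetical addresses, chosen only for illustration.
CLIENT = "10.0.0.5"      # client A
RESOLVER = "10.0.0.53"   # resolver X (the device that writes the logs)
UPSTREAM = "192.0.2.1"   # some public resolver


@dataclass(frozen=True)
class DnsLogEntry:
    src_ip: str
    dst_ip: str
    kind: str   # "query" or "response"
    query: str


def entries_for_lookup(client, resolver, upstream, query):
    """Return the four log entries one recursive lookup would produce."""
    return [
        DnsLogEntry(client, resolver, "query", query),       # client A -> resolver X
        DnsLogEntry(resolver, upstream, "query", query),     # resolver X -> outside
        DnsLogEntry(upstream, resolver, "response", query),  # outside -> resolver X
        DnsLogEntry(resolver, client, "response", query),    # resolver X -> client A
    ]


entries = entries_for_lookup(CLIENT, RESOLVER, UPSTREAM, "example.com")
for entry in entries:
    print(entry)
```

In this model the client only ever appears as the source of a query and as the destination of a response, which is the property the rest of this issue relies on.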

What the detections look for:

Let's take the 'Excessive anomaly based NXDOMAIN DNS queries detected' as an example.

The description states the intent of this detection: This rule uses anomaly decompose to generate an alert when a client requests an excessive number of DNS queries that return NXDOMAIN (meaning the domain does not exist). This helps identifying C2 communications.

However, the implementation is as follows: the detection looks for logs with response code NXDOMAIN (these will always be responses, never queries) and summarizes on SrcIp. This is the crux. Should this not be DstIp?

In responses, the SrcIp will always be the resolver! The client will never send responses back to the resolver. An organisation will have just a handful of DNS resolvers at most, so in reality we are detecting if the whole organization has excessive DNS queries which return NXDOMAIN. This is not the purpose of the detection at all.

If we summarized on DstIp, we could monitor if an individual client is making excessive queries which return NXDOMAIN, which is the intent of the detection, following the description.
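
To make the difference concrete, here is a small Python sketch that counts the same NXDOMAIN responses by source and by destination. The addresses and counts are made up, and only the resolver-to-client leg is modelled; responses from upstream resolvers to the resolver are left out for brevity.

```python
from collections import Counter

RESOLVER = "10.0.0.53"  # hypothetical internal DNS server

# Hypothetical NXDOMAIN responses as (src_ip, dst_ip) pairs, as logged by the resolver.
nxdomain_responses = (
    [(RESOLVER, "10.0.0.5")] * 50    # one client with an unusual number of failures
    + [(RESOLVER, "10.0.0.6")] * 2
    + [(RESOLVER, "10.0.0.7")] * 1
)

# Grouping on the source collapses every client into the resolver.
by_src = Counter(src for src, _ in nxdomain_responses)

# Grouping on the destination keeps one count per client.
by_dst = Counter(dst for _, dst in nxdomain_responses)

print("by source:     ", dict(by_src))
print("by destination:", dict(by_dst))
```

Grouped by source, the noisy client is indistinguishable from the rest of the network; grouped by destination, it stands out as a single host.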

The other detections ('Identifying DGA via DNS failures' and 'Failures from multiple clients') have the same flaw. I did not check the 'Rare static threshold based client observed with high reverse DNS lookup count' detection yet.

There are other issues with these detections, such as some detections completely ignoring everything that is not rDNS and not filtering for rDNS queries that refer to local IPs, such as 10.0.x.x ranges etc.
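
For the rDNS point, here is a minimal Python sketch of how one might recognise reverse lookups that refer to private IPv4 addresses. The helper names `ptr_name_to_ip` and `is_private_rdns` are hypothetical, only IPv4 `in-addr.arpa` names are handled, and this is not how the existing detections work.

```python
import ipaddress


def ptr_name_to_ip(name):
    """Return the IPv4 address encoded in an in-addr.arpa name, or None."""
    suffix = ".in-addr.arpa"
    lowered = name.lower().rstrip(".")
    if not lowered.endswith(suffix):
        return None
    labels = lowered[: -len(suffix)].split(".")
    if len(labels) != 4:
        return None
    try:
        # PTR names list the octets in reverse order.
        return ipaddress.IPv4Address(".".join(reversed(labels)))
    except ipaddress.AddressValueError:
        return None


def is_private_rdns(name):
    """True if the name is a reverse lookup for a private IPv4 address."""
    ip = ptr_name_to_ip(name)
    return ip is not None and ip.is_private


for query in ["5.0.0.10.in-addr.arpa", "8.8.8.8.in-addr.arpa", "example.com"]:
    print(query, is_private_rdns(query))
```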

Thanks in advance to anyone looking into this.

@v-sudkharat v-sudkharat self-assigned this Jan 15, 2025
@v-sudkharat v-sudkharat added Analytic Rules question Further information is requested labels Jan 15, 2025
@v-visodadasi
Contributor

Hi @tbethe , Thanks for flagging this issue, we will investigate this issue and get back to you with some updates. Thanks!

@tbethe
Author

tbethe commented Jan 28, 2025

@v-visodadasi Any update on this?

@v-visodadasi
Contributor

@tbethe, We are actively working on this issue and will get back to you soon with an update!

@v-visodadasi
Contributor

@tbethe ,

I tested the detection rule by summarizing on both DstIpAddr and SrcIpAddr, and I am getting the same output logs for both. Could you please provide more details or share relevant logs or screenshots to help us understand the issue better?

Thank you!

@tbethe
Author

tbethe commented Feb 17, 2025

That is certainly surprising. Can you tell me what the data you are querying looks like? Is it also generated by the DNS servers? In other words, is your DNS logging the same as mine: generated from the perspective of the DNS servers, meaning that hosts never send DNS responses, only requests?

I'll also note that all queries (except maybe the "Rare client observed with high reverse DNS lookup count") have this problem, not just the one I used as an example.

Sharing logs is difficult, due to them being private, but I can share some meta-data. For example:

Summarizing on:

  • SrcIpAddr -> 2 hosts as results. This makes sense, since this environment only has a few DNS servers, which are the only hosts that can be a result.
  • DstIpAddr -> ca. 400 hosts as results.

The query where we summarize on DstIpAddr:

let threshold = 2.5;
let min_t = ago(14d);
let max_t = now();
let dt = 1d;
let summarizationexist = (
  union isfuzzy=true
      (
      DNS_Summarized_Logs_ip_CL
      | where EventTime_t > ago(1d)
      | project v = int(2)
      ),
      (
      print int(1)
      | project v = print_0
      )
  | summarize maxv = max(v)
  | extend sumexist = (maxv > 1)
  );
let allData = union isfuzzy=true
      (
      (datatable(exists: int, sumexist: bool)[1, false]
      | join (summarizationexist) on sumexist)
      | join (
          _Im_Dns(responsecodename='NXDOMAIN', starttime=todatetime(min_t), endtime=max_t)
          | summarize Count=count() by DstIpAddr, bin(TimeGenerated, 1h)
          | extend EventTime = TimeGenerated, Count = toint(Count), exists=int(1)
          )
          on exists
      | project-away exists, maxv, sum*
      ),
      (
      DNS_Summarized_Logs_ip_CL
      | where EventTime_t > min_t and EventResultDetails_s == 'NXDOMAIN'
      | summarize Count=toint(sum(count__d)) by SrcIpAddr=SrcIpAddr_s, bin(EventTime=EventTime_t, 1h)
      );
allData
| make-series EventCount=sum(Count) on EventTime from min_t to max_t step dt by DstIpAddr
| extend (anomalies, score, baseline) = series_decompose_anomalies(EventCount, threshold, -1, 'linefit')
| mv-expand anomalies, score, baseline, EventTime, EventCount
| extend
  anomalies = toint(anomalies),
  score = toint(score),
  baseline = toint(baseline),
  EventTime = todatetime(EventTime),
  Total = tolong(EventCount)
| where EventTime >= ago(dt)
// | where score >= threshold * 2 // COMMENTED OUT FOR TESTING
| join kind=inner(_Im_Dns(responsecodename='NXDOMAIN', starttime=ago(dt), endtime=max_t)
  | summarize DNSQueries = make_set(DnsQuery) by DstIpAddr)
  on DstIpAddr
// | project-away SrcIpAddr1

@v-visodadasi
Contributor

@tbethe ,
Thank you for your response. I apologize for the confusion. The data I initially ingested had an equal number of SrcIpAddr and DstIpAddr, which does not accurately reflect the typical DNS logging scenario.

In most network environments, SrcIpAddr hosts (representing DNS servers) are fewer compared to DstIpAddr hosts (representing clients or hosts making DNS queries). This is because DNS servers are limited in number, while there are many clients or hosts generating DNS queries.

I understand that this discrepancy has caused issues with the queries. We are reaching out to the relevant team to discuss how to accurately represent the typical DNS logging environment. Once we receive an update, we will inform you.

Thank you for your patience and understanding.
