how to interpolate smooth distributions? #28

@priamai

Hi there,
I guess I may have already hit a limitation with the library.
Any help would be great, maybe I have to move to a more complex solution.
Anyway here's my issue:

def example_learning():

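    import pandas as pd
    import hedgehog as hh  # assumption: `hh` is the hedgehog package; the original omitted this import

    # Every column needs the same number of rows. The original "Outcome" list had
    # only two entries; the third value here is a placeholder assumption.
    samples = pd.DataFrame({"Host": ["carl", "ermano", "jon"],
                            "Detection": ["PsExec", "PsExec", "PsExec"],
                            "Outcome": ["TP", "FP", "TP"],
                            "HourOfDay": [5, 10, 13]})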
    print(samples)

    structure = hh.structure.chow_liu(samples)

    bn = hh.BayesNet(*structure)
    bn = bn.fit(samples)
    bn.prepare()
    '''
    dot = bn.graphviz()

    path = dot.render('asia', directory='figures', format='svg', cleanup=True)
    '''
    print("Probability of detection")
    print(bn.P["Detection"])

    print("Probability of outcome")
    print(bn.P["Outcome"])
    print("Probability of FP at 5 am")
    event = {"Host":"carl","Detection":"PsExec","HourOfDay":5}
    print(bn.predict_proba(event))
    print("Probability of FP at 6 am")
    # This will fail because HourOfDay=6 was never seen in training: how do we generalize?
    event = {"Host":"carl","Detection":"PsExec","HourOfDay":6}
    print(bn.predict_proba(event))

I want to predict the probability of a false positive at 6 am, an hour that was not observed in the training set.
I am not sure what the correct approach is here. Is there a way to assign a smooth distribution across the 24 hours, so that unobserved hours still get a small probability?
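
To make the idea concrete, here is a minimal sketch of one possible kind of smoothing, assuming simple additive (Laplace) smoothing over the 24 hourly bins. It is plain Python and not part of the library's API; the function name `smoothed_hour_distribution` and the `alpha` parameter are hypothetical and chosen only for illustration.

```python
from collections import Counter


def smoothed_hour_distribution(hours, alpha=1.0, n_bins=24):
    """Additive (Laplace) smoothing over the hours 0..n_bins-1.

    Hypothetical helper: every hour gets a pseudo-count of `alpha`,
    so hours that never appear in `hours` still get a non-zero probability.
    """
    counts = Counter(hours)
    total = len(hours) + alpha * n_bins
    return {h: (counts.get(h, 0) + alpha) / total for h in range(n_bins)}


# The three hours from the training sample above.
p_hour = smoothed_hour_distribution([5, 10, 13], alpha=0.1)

print(p_hour[5])  # observed hour
print(p_hour[6])  # unobserved hour: small but non-zero
```

A smaller `alpha` keeps the distribution closer to the raw counts, while a larger one pulls it toward uniform. This only smooths the marginal distribution of `HourOfDay`; whether and how the library can apply something similar inside its conditional probability tables is exactly the open question.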

How do other libraries like pomegranate handle this kind of situation?
Cheers!
