Implement TPC-H data loader #97
8a3e3dd to be66704
…ling-simulator into tpch-loader
# # Add the TaskGraphRelease events into the system.
# for task_graph_name, task_graph in self._workload.task_graphs.items():
#     event = Event(
#         event_type=EventType.TASK_GRAPH_RELEASE,
#         time=task_graph.release_time,
#         task_graph=task_graph_name,
#     )
#     self._event_queue.add_event(event)
#     self._logger.info(
#         "[%s] Added %s to the event queue.",
#         self._simulator_time.to(EventTime.Unit.US).time,
#         event,
#     )
This should still exist in the Simulator, right?
# TODO: make configurable
TPCH_SUBDIR = "100g/"
DECIMA_TPCH_DIR = (
    "/home/dgarg39/erdos-scheduling-simulator/profiles/workload/tpch/decima/"
)
CLOUDLAB_TPCH_DIR = (
    "/home/dgarg39/erdos-scheduling-simulator/profiles/workload/tpch/cloudlab/"
)
Can you put this in the flags please since this won't work across other machines?
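One way to act on this, shown with the standard-library argparse as a stand-in for whatever flag mechanism the simulator already uses. The flag names, their defaults, and the root/source/size directory layout are all hypothetical placeholders chosen for illustration; only the "decima", "cloudlab", and "100g" values come from the constants above.

```python
import argparse
from pathlib import Path


def build_parser() -> argparse.ArgumentParser:
    """Build a parser exposing the TPC-H profile location as flags.

    All flag names below are hypothetical; they are not taken from the PR.
    """
    parser = argparse.ArgumentParser(description="TPC-H loader paths (sketch)")
    parser.add_argument(
        "--tpch_profile_dir",
        type=Path,
        default=Path("profiles/workload/tpch"),
        help="Root directory that holds the TPC-H profiles.",
    )
    parser.add_argument(
        "--tpch_profile_source",
        choices=("decima", "cloudlab"),
        default="decima",
        help="Which set of profiles to load.",
    )
    parser.add_argument(
        "--tpch_dataset_size",
        default="100g",
        help="Dataset-size subdirectory inside the chosen profile source.",
    )
    return parser


def resolve_profile_dir(args: argparse.Namespace) -> Path:
    """Join the flag values into one path (assumed layout: root/source/size)."""
    return args.tpch_profile_dir / args.tpch_profile_source / args.tpch_dataset_size


# Parse an explicit argument list so the sketch does not depend on sys.argv.
example_args = build_parser().parse_args(["--tpch_profile_source", "cloudlab"])
example_dir = resolve_profile_dir(example_args)
```

With a relative default, the same invocation works from any checkout of the repository, and a machine with profiles stored elsewhere only needs to override the root flag.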
class SetWithCount(object):
    """
    allow duplication in set
    """

    def __init__(self):
        self.set = {}

    def __contains__(self, item):
        return item in self.set

    def add(self, item):
        if item in self.set:
            self.set[item] += 1
        else:
            self.set[item] = 1

    def clear(self):
        self.set.clear()

    def remove(self, item):
        self.set[item] -= 1
        if self.set[item] == 0:
            del self.set[item]
Is this doing anything different from collections.Counter?
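For reference, a minimal sketch of where the two differ. A plain Counter keeps a key once its count has been decremented to zero, so a membership test still succeeds, whereas SetWithCount.remove deletes the key. The helper remove_one and the "stage_0" key are hypothetical names used only for this illustration.

```python
from collections import Counter


def remove_one(counter: Counter, item) -> None:
    """Decrement an item's count and drop the key once it reaches zero."""
    if item not in counter:
        # SetWithCount.remove also raises KeyError for unknown items.
        raise KeyError(item)
    counter[item] -= 1
    if counter[item] == 0:
        del counter[item]


# Plain Counter: decrementing to zero leaves the key in place.
plain = Counter()
plain["stage_0"] += 1  # missing keys start at 0, so no explicit init is needed
plain["stage_0"] -= 1
plain_keeps_key = "stage_0" in plain

# Counter plus the helper: membership tracks "count > 0" like SetWithCount.
wrapped = Counter()
wrapped["stage_0"] += 2
remove_one(wrapped, "stage_0")
one_left = "stage_0" in wrapped
remove_one(wrapped, "stage_0")
gone = "stage_0" not in wrapped
```

So Counter covers add, clear, and membership directly; only the delete-at-zero behavior of remove needs a small helper (or a check of `counter[item] > 0` instead of `item in counter`).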
This patch implements a data loader for TPC-H workloads. The loader is modeled after data/alibaba_loader.py.
The simulator does not correctly handle the addition of TASK_GRAPH_RELEASE events when the workload is mutated by the loader. The event handler for TASK_GRAPH_RELEASE is only used for logging, so I commented out the code that adds the event to the queue and deferred a proper fix.