Dashboards

This describes the new approach to dashboards which will be implemented with #955.

Structure and exports

Each ActivityPackage can optionally export a dashboards object, which contains a set of dashboardName/dashboardT key/values.

Each dashboardT contains a mergeLog function and, optionally, a prepareDataForDisplay function, both of which are executed on the server, and a Viewer component, which is shown in the teacher's browser. It also defines the initial data (initData) and, optionally, exampleLogs.
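As a rough illustration, a dashboards export might look like the sketch below. The dashboard name progress, the package id, the log field log.type and the plain function standing in for the Viewer component are all assumptions made for this example; only dashboards, initData, mergeLog, prepareDataForDisplay, Viewer and exampleLogs come from the description above.

```javascript
// Hypothetical dashboard definition (a "dashboardT").
const progressDashboard = {
  // Initial value of the in-memory state.
  initData: { counts: {} },

  // Runs on the server for every log message; mutates state in place.
  mergeLog: (state, log) => {
    const type = log.type; // assumed field name
    state.counts[type] = (state.counts[type] || 0) + 1;
  },

  // Optional; runs on the server and returns the data synced to the teacher.
  prepareDataForDisplay: state => ({
    total: Object.values(state.counts).reduce((a, b) => a + b, 0)
  }),

  // Shown in the teacher's browser. A real Viewer would be a component;
  // a plain function stands in for it here.
  Viewer: ({ state }) => `Total events: ${state.total}`,

  // Optional sample logs.
  exampleLogs: [{ type: 'submit' }, { type: 'edit' }, { type: 'submit' }]
};

// The ActivityPackage exports a set of dashboardName/dashboardT pairs.
const activityPackage = {
  id: 'ac-example', // hypothetical package id
  dashboards: { progress: progressDashboard }
};
```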

Flow of data

logger

When an activity generates a log message with the function passed through the logger prop, it is annotated with extra data about the activityId, userId, etc, and stored in the database for future analysis. It is also sent to the mergeLog functions of all the associated dashboards of that activity.
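The logging path described above could be sketched roughly as follows. The createLogger helper, the in-memory storedLogs array (standing in for the database) and the shape of the context object are assumptions for illustration, not the actual implementation.

```javascript
// Stand-in for the database collection that keeps every log message.
const storedLogs = [];

// Builds the function that would be passed to an activity as its logger prop.
// `dashboards` maps dashboard names to dashboardT objects and `states` maps
// the same names to their in-memory state (both hypothetical containers).
const createLogger = (context, dashboards, states) => payload => {
  // Annotate the raw message with contextual data.
  const log = {
    ...payload,
    activityId: context.activityId,
    userId: context.userId
  };
  // Store it for future analysis.
  storedLogs.push(log);
  // Forward it to the mergeLog function of every associated dashboard.
  Object.keys(dashboards).forEach(name => {
    dashboards[name].mergeLog(states[name], log);
  });
  return log;
};

// Usage: one dashboard that counts messages and remembers the last user.
const dashboards = {
  counter: {
    mergeLog: (state, log) => {
      state.count += 1;
      state.lastUser = log.userId;
    }
  }
};
const states = { counter: { count: 0, lastUser: null } };
const logger = createLogger({ activityId: 'a1', userId: 'u1' }, dashboards, states);
logger({ type: 'click' });
```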

mergeLog

The mergeLog function is called with state, a plain mutable JavaScript object initialized from initData, and log, which contains a single log message (as well as activity, to access the config, etc.). The mergeLog function should mutate state directly, for example by:

setting (state.x = 3)
deleting keys (delete state.x)
operating on arrays (push, splice, pop, shift, etc)
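A minimal mergeLog using each of these kinds of mutation might look like this. The log fields type, userId and value, and the shape of initData, are assumptions for the example.

```javascript
// Hypothetical initial state: latest answer per user, plus recent activity.
const initData = { answers: {}, recent: [] };

const mergeLog = (state, log) => {
  if (log.type === 'answer') {
    // Setting a key.
    state.answers[log.userId] = log.value;
    // Operating on an array: keep only the three most recent users.
    state.recent.push(log.userId);
    if (state.recent.length > 3) {
      state.recent.shift();
    }
  } else if (log.type === 'retract') {
    // Deleting a key.
    delete state.answers[log.userId];
  }
};

// Usage: start from a copy of initData and merge a few logs.
const state = JSON.parse(JSON.stringify(initData));
mergeLog(state, { type: 'answer', userId: 'u1', value: 'A' });
mergeLog(state, { type: 'answer', userId: 'u2', value: 'B' });
mergeLog(state, { type: 'retract', userId: 'u1' });
```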

Note that Node is single-threaded, so nothing else can modify state while a single mergeLog function is executing. Although mergeLog modifies state directly, it should still be thought of as a "pure function": given the same state and the same log, mergeLog(state, log) should always leave state with the same contents.

It is also important that state is always serializable to JSON, which rules out, for example, circular references (const a = {}; a.b = a is valid JS, but will not work for dashboard states). It is equally important not to reassign state, i.e. not to write state = {}.
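Both pitfalls can be demonstrated in a few lines. The helper names below (isSerializable, badReset, goodReset) are made up for this sketch; note that JSON.stringify only throws on circular references (and a few other cases such as BigInt), so this check does not catch values that are silently dropped, such as functions.

```javascript
// Returns false when JSON.stringify throws, e.g. on circular references.
const isSerializable = state => {
  try {
    JSON.stringify(state);
    return true;
  } catch (e) {
    return false;
  }
};

const circular = {};
circular.self = circular; // valid JS, but not valid as a dashboard state

// Reassigning only rebinds the local parameter; the caller's object is
// untouched, so the "reset" is silently lost.
const badReset = state => {
  state = {};
};

// Mutating the existing object is visible to the caller.
const goodReset = state => {
  Object.keys(state).forEach(key => delete state[key]);
};

const stateA = { x: 1 };
badReset(stateA); // stateA is still { x: 1 }

const stateB = { x: 1 };
goodReset(stateB); // stateB is now {}
```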

Implementation detail

We could of course also work in a purely functional manner, where the function returns a new state, and does not modify a mutable variable. However, since our state might become quite large (which is no problem, it's only stored in memory), and might be modified very frequently, even by changing a single key out of thousands, this seems like it would have large performance issues. If the mutability ever becomes an issue, let's revisit looking at something like immer.js or Immutable.js to enhance performance.

Performance notes

As mentioned above, state is never written to the database, synced across multiple servers (all log processing happens on a single designated server), or synced to the teacher's dashboard. Thus, it should be quite cheap to store all necessary contextual information in state. If the server restarts, it automatically loads all log messages from the database and reruns mergeLog for each of them to arrive at an up-to-date state (only for currently open activities). This is why mergeLog must be "functionally pure" and must not depend on, for example, Date.now().
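The restart behaviour amounts to replaying the stored logs over a fresh copy of initData. The sketch below assumes a rebuildState helper and a timestamp field on each stored log (both hypothetical); reading the time from the log rather than from the clock is what keeps the replay deterministic.

```javascript
// Rebuilds a dashboard state by replaying stored logs in order.
const rebuildState = (dashboard, storedLogs) => {
  // Start from a fresh copy so initData itself is never mutated.
  const state = JSON.parse(JSON.stringify(dashboard.initData));
  storedLogs.forEach(log => dashboard.mergeLog(state, log));
  return state;
};

const dashboard = {
  initData: { lastSeen: {} },
  // Uses the time recorded in the log (assumed field), never Date.now(),
  // so replaying after a restart yields exactly the same state.
  mergeLog: (state, log) => {
    state.lastSeen[log.userId] = log.timestamp;
  }
};

const storedLogs = [
  { userId: 'u1', timestamp: 1000 },
  { userId: 'u2', timestamp: 2000 },
  { userId: 'u1', timestamp: 3000 }
];

const beforeRestart = rebuildState(dashboard, storedLogs);
const afterRestart = rebuildState(dashboard, storedLogs);
```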

prepareDataForDisplay

The dashboard can optionally provide a prepareDataForDisplay function, which receives the state (as well as activity, to access the config, etc.) and returns the data that will be synced to the teacher's dashboard. The prepareDataForDisplay function is never called more than once per second, and only when at least one teacher is actively viewing that specific dashboard. If there is no such function, the state itself is synced to the teacher's browser, again at most once per second rather than each time it is updated.
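One way to picture the rate limit is the sketch below. The createSyncer helper, the send callback and the explicit now and viewers arguments are assumptions chosen to keep the example testable without real timers; cloning of state is left out for brevity.

```javascript
// Returns a function that syncs at most once per minIntervalMs, and only
// while at least one teacher is viewing the dashboard.
const createSyncer = (dashboard, send, minIntervalMs = 1000) => {
  let lastSent = -Infinity;
  return (state, activity, viewers, now) => {
    if (viewers === 0) return false; // nobody is watching
    if (now - lastSent < minIntervalMs) return false; // too soon
    // Fall back to the raw state when no prepareDataForDisplay is defined.
    const data = dashboard.prepareDataForDisplay
      ? dashboard.prepareDataForDisplay(state, activity)
      : state;
    send(data);
    lastSent = now;
    return true;
  };
};

// Usage with explicit timestamps (in milliseconds).
const sent = [];
const sync = createSyncer(
  { prepareDataForDisplay: state => ({ n: state.items.length }) },
  data => sent.push(data)
);
const state = { items: [1, 2, 3] };
const results = [
  sync(state, {}, 0, 0), // no viewers: skipped
  sync(state, {}, 1, 0), // first sync: sent
  sync(state, {}, 1, 500), // within one second: skipped
  sync(state, {}, 1, 1000) // one second later: sent
];
```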

The state is cloned before being handed to prepareDataForDisplay, so the function cannot influence state outside of its own context; only what it explicitly returns matters. This is different from mergeLog.
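The effect of the clone can be shown with a prepareDataForDisplay that sorts an array in place. How the cloning is actually done is not specified here; the JSON round trip below is an assumption that works because state must be JSON-serializable anyway, and cloneState and runPrepare are hypothetical helper names.

```javascript
// Deep copy via a JSON round trip (assumed cloning strategy).
const cloneState = state => JSON.parse(JSON.stringify(state));

// Hands a clone, never the live state, to prepareDataForDisplay.
const runPrepare = (dashboard, state, activity) =>
  dashboard.prepareDataForDisplay(cloneState(state), activity);

const dashboard = {
  prepareDataForDisplay: state => {
    // Sorting in place would corrupt the live state if it were not a clone.
    state.scores.sort((a, b) => a - b);
    return { lowest: state.scores[0] };
  }
};

const state = { scores: [3, 1, 2] };
const output = runPrepare(dashboard, state, {});
// state.scores keeps its original order; only `output` carries the result.
```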

Viewer

The Viewer receives the state, as well as the normal properties, like activity, instances, config, etc. It is guaranteed that state will update at most once per second.

Next activity

When an activity closes, the final state (if there is no prepareDataForDisplay), or the output of running prepareDataForDisplay on the final state, is stored in the database. Whenever a teacher or student wants to look at an old dashboard, the data is simply fetched from the database and displayed. This avoids keeping many large states in memory when they can no longer change.
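A sketch of this hand-off, with plain objects standing in for the in-memory states and the database. The closeActivity helper and both containers are hypothetical.

```javascript
// Archives the final dashboard data and frees the in-memory state.
const closeActivity = (activityId, dashboard, liveStates, archive, activity) => {
  const state = liveStates[activityId];
  const finalData = dashboard.prepareDataForDisplay
    ? dashboard.prepareDataForDisplay(state, activity)
    : state;
  // Stand-in for the database write; the copy detaches it from memory state.
  archive[activityId] = JSON.parse(JSON.stringify(finalData));
  // The state can no longer change, so drop it from memory.
  delete liveStates[activityId];
  return archive[activityId];
};

// Usage
const liveStates = { a1: { answers: { u1: 'A', u2: 'B' } } };
const archive = {};
const dashboard = {
  prepareDataForDisplay: state => ({ answered: Object.keys(state.answers).length })
};
closeActivity('a1', dashboard, liveStates, archive, {});
```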

Multi-server setup

When running in development mode, no change to the current architecture is required. For a setup with multiple Meteor servers, however, one of them will be designated the log processor, and its URL/port will be given to the other Meteor servers (all through Meteor.settings). A Redis queue will be set up between all the Meteor clients; every incoming log message will be written to the database and then put on this queue for dashboard processing. The designated log processor will take log items off the queue one at a time, process each one, update state, etc. The teacher's browser will set up a separate websocket to the log processor and subscribe to the relevant collections with DDP. The same applies to students viewing an ac-dashboard activity, unless the activity has completed (the most common case for students), in which case the data is simply fetched from the database.
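The write-then-enqueue flow and the one-at-a-time processing can be sketched with in-memory arrays. These arrays merely stand in for the database and the Redis queue; receiveLog and processQueue are hypothetical names, and no real Redis or Meteor API is used.

```javascript
const database = []; // stand-in for the log collection
const queue = []; // stand-in for the Redis queue

// Runs on whichever server receives the log message.
const receiveLog = log => {
  database.push(log); // 1. persist
  queue.push(log); // 2. enqueue for dashboard processing
};

// Runs only on the designated log processor.
const processQueue = (dashboard, state) => {
  let processed = 0;
  while (queue.length > 0) {
    const log = queue.shift(); // take one item at a time, in arrival order
    dashboard.mergeLog(state, log);
    processed += 1;
  }
  return processed;
};

// Usage
const dashboard = {
  mergeLog: (state, log) => {
    state.seen.push(log.id);
  }
};
const state = { seen: [] };
receiveLog({ id: 1 });
receiveLog({ id: 2 });
const processedCount = processQueue(dashboard, state);
```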

Performance discussions

We will need to experiment with where to distribute calculation and logic/inference etc, between the three stages mergeLog, prepareDataForDisplay, and the render function in Viewer. Most likely, mergeLog should be optimized for speed, and keeping enough data/context to be able to do different kinds of calculations in the future. prepareDataForDisplay should be optimized for length of output - for example, if a dashboard shows a dotplot with thousands of dots, which cannot all be perceived by the teacher, and it can be easily calculated which hundred points could visually represent these thousand dots, it would be better to do this in prepareDataForDisplay, before syncing with the client. However, if there is some slightly expensive calculation which does not change the length of the data, it might be better to do it in the client, to remove pressure on the server. We will experiment with this.
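For the dotplot case, the reduction could be as simple as evenly spaced sampling. This is only one possible strategy, assumed here for illustration; the downsample helper and the state.dots field are hypothetical, and a real dashboard might prefer binning or another visually faithful method.

```javascript
// Picks at most maxPoints evenly spaced items from points.
const downsample = (points, maxPoints) => {
  if (points.length <= maxPoints) return points.slice();
  const step = points.length / maxPoints;
  const result = [];
  for (let i = 0; i < maxPoints; i += 1) {
    result.push(points[Math.floor(i * step)]);
  }
  return result;
};

// Shrinks the payload on the server before it is synced to the client.
const prepareDataForDisplay = state => ({
  dots: downsample(state.dots, 100)
});

// Usage: a thousand dots in state, a hundred sent to the teacher.
const state = { dots: Array.from({ length: 1000 }, (_, i) => i) };
const display = prepareDataForDisplay(state);
```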

Future implications for operators

The idea of continually updating an in-memory model (state), and then having functions act on this model, could be an interesting model for live operators, and also for a new approach where "analytics plugins" (sentiment analysis, semantics etc) can feed both dashboard visualizations and social/product operators etc. This will be explored further.

Product-based dashboards

Currently, the model is based on student activity. It is of course possible to "reconstruct" the current state of students' products, by logging the current product on every change, and keeping an up-to-date representation in state - but for dashboards that focus on the current state of products, this seems unnecessary. Perhaps we could also have an optional function that subscribes to all the reactive documents of a given activity, and is called on every change, and has the option to update a productState, which could then also be accessible to the prepareDataForDisplay function. This way we could very easily implement something showing the current length of all student documents, etc. Could also be interesting for live operators, as discussed above. To be explored further.
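Purely as a thought experiment, such an optional function could look like the sketch below. Nothing here exists yet: the names mergeProduct and initProductState, the document fields userId and text, and the third argument to prepareDataForDisplay are all invented for illustration.

```javascript
// Hypothetical product-based dashboard tracking document lengths.
const productDashboard = {
  initProductState: {},

  // Would be called on every change to a reactive document of the activity.
  mergeProduct: (productState, doc) => {
    productState[doc.userId] = doc.text.length;
  },

  // productState is passed alongside the log-based state.
  prepareDataForDisplay: (state, activity, productState) => ({
    lengths: { ...productState }
  })
};

// Usage: two document changes, the second replacing the first for u1.
const productState = { ...productDashboard.initProductState };
productDashboard.mergeProduct(productState, { userId: 'u1', text: 'Hello' });
productDashboard.mergeProduct(productState, { userId: 'u2', text: 'Hi' });
productDashboard.mergeProduct(productState, { userId: 'u1', text: 'Hello world' });
const display = productDashboard.prepareDataForDisplay({}, {}, productState);
```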
