Skip to content
Merged
Show file tree
Hide file tree
Changes from 4 commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
10 changes: 10 additions & 0 deletions CHANGELOG.md
Original file line number Diff line number Diff line change
@@ -1,5 +1,15 @@
# Changelog

## UNRELEASED

### Adds

* Apostrophe now automatically adds the appropriate default values for new properties in the schema, even for existing documents in the database. This is done automatically during the migration phase of startup.

### Fixes

* Apostrophe's migration logic is no longer executed twice on every startup and three times in the migration task. It is executed exactly once, always at the same point in the startup process. This bug did not cause significant performance issues because migrations were only executed once, but there is a small performance improvement.

## 4.7.0 (2024-09-05)

### Adds
Expand Down
2 changes: 1 addition & 1 deletion index.js
Original file line number Diff line number Diff line change
Expand Up @@ -300,7 +300,7 @@ async function apostrophe(options, telemetry, rootSpan) {
self.apos.schema.validateAllSchemas();
self.apos.schema.registerAllSchemas();
await self.apos.lock.withLock('@apostrophecms/migration:migrate', async () => {
await self.apos.migration.migrate(); // emits before and after events, inside the lock
await self.apos.migration.migrate(self.apos.argv);
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For consistency as this was passed in the other, legacy invocation. It is not currently used.

// Inserts the global doc in the default locale if it does not exist; same for other
// singleton piece types registered by other modules
for (const module of Object.values(self.modules)) {
Expand Down
2 changes: 1 addition & 1 deletion lib/moog.js
Original file line number Diff line number Diff line change
Expand Up @@ -170,7 +170,7 @@ module.exports = function(options) {
}
for (const key of Object.keys(step)) {
if (!(validKeys.includes(key) || cascades.includes(key))) {
const message = upgradeHints[key] || `${key} is not a valid top level property for an Apostrophe 3.x module. Make sure you nest regular module options in the new "options" property.`;
const message = upgradeHints[key] || `${key} is not a valid top level property for an Apostrophe module. Make sure you nest regular module options in the "options" property.`;
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Updated language.

throw `${clarifyModuleName(step.__meta.name)}: ${message}`;
}
}
Expand Down
32 changes: 19 additions & 13 deletions modules/@apostrophecms/migration/index.js
Original file line number Diff line number Diff line change
@@ -1,5 +1,6 @@
const broadband = require('broadband');
const _ = require('lodash');

// Provide services for database migration. The `@apostrophecms/migration:migrate` task
// carries out all migrations that have been registered with this module. Migrations
// are used to make changes to the database at the time of a new code deployment,
Expand Down Expand Up @@ -38,15 +39,11 @@ module.exports = {
manager.addSortifyMigration(field.name);
});
});
},
async executeMigrations() {
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

D'oh. This was completely redundant in the current setup.

if ((process.env.NODE_ENV !== 'production') || self.apos.isNew) {
// Run migrations at dev startup (low friction).
// Also always run migrations at first startup, so even
// in prod with a brand new database the after event always fires
// and we get a chance to mark the migrations as skipped
await self.migrate(self.apos.argv);
}
}
},
before: {
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Invoke the new capability, which is not a registered, named migration, but rather an independent part of the migration phase that does its own checks to determine if work is needed.

async addMissingSchemaFields() {
await self.addMissingSchemaFields();
}
}
};
Expand Down Expand Up @@ -245,8 +242,13 @@ module.exports = {
// Just in case the db has no documents but did
// start to run migrations on a previous attempt,
// which causes an occasional unique key error if not
// corrected for here
await self.db.removeMany({});
// corrected for here.
//
// Other migration-related facts that are not migration
// names are stored with a leading *, leave them alone
await self.db.removeMany({
_id: /^[^*]/
Copy link
Member Author

@boutell boutell Sep 10, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Somewhere to keep memos like the previous list of existing schema fields. Otherwise we have a proliferation of tiny mongo collections.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We could have just one new, separate collection like aposModuleSchema or aposRegisteredSchema so we don't store it into the aposMigration collection, which is a bit tied to migrations, but not really...

Inside that dedicated collection, we could have the schema stored in documents: 1 document in that collection = 1 module schema.

That would require changing a lot of code, and since you've already finished the feature, maybe it's not worth it.
Just wanted to have your thoughts on that, to have the schemas stored in a dedicated collection for clarity

Copy link
Member Author

@boutell boutell Sep 12, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would rather not proliferate collections for very small amounts of data unless necessary as they can add open file descriptors in MongoDB.

});
await self.db.insertMany(self.migrations.map(migration => ({
_id: migration.name,
at,
Expand Down Expand Up @@ -286,14 +288,18 @@ module.exports = {
throw err;
}
}
}
},
...require('./lib/addMissingSchemaFields.js')(self)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can we put the require at the top to make the dependency clear

const addMissingSchemaFields = require('./lib/addMissingSchemaFields.js');

and then

Suggested change
...require('./lib/addMissingSchemaFields.js')(self)
...addMissingSchemaFields(self)

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done

};
},
tasks(self) {
return {
migrate: {
usage: 'Apply any necessary migrations to the database.',
task: self.migrate
// Migrations actually run on every invocation
// and automatically detect whether any work
// must be done
task: () => {}
Copy link
Member Author

@boutell boutell Sep 10, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Previously this task invoked the migrate code three times in dev: once in the actual, intended migration phase of startup (see index.js), once in an "apostrophe:ready" handler, and once here. Being thorough I guess geez past Tom

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

should we remove the entire tasks block?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The task must exist otherwise users will get an error when launching the migration task according to our documentation. The fact that it is implicit is an implementation detail that should not concern users. Later we may eliminate the implicit invocation except when the task is being invoked, in production.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This means that if I manually run node app.js @apostrophecms/migration:migrate nothing will happen now.
I might want to run the migration without starting apostrophe server, or start apostrophe without the migration.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@haroun it has worked like this for about as long as A3 has existed. Plus extra, redundant invocations in two other places. Since we have this automatic invocation at every startup I am fixing the bug of redundant invocations. As I mentioned we might reconsider this behavior later for production, but in dev it never makes sense to start up without migrating, if your code does not behave after migrations then that is a bug in the migrations that needs attention and something that will break in production if bypassed, etc.

}
};
}
Expand Down
141 changes: 141 additions & 0 deletions modules/@apostrophecms/migration/lib/addMissingSchemaFields.js
Original file line number Diff line number Diff line change
@@ -0,0 +1,141 @@
const _ = require('lodash');
const { klona } = require('klona');

module.exports = (self) => {
return {
async addMissingSchemaFields() {
let scans = 0; let updates = 0;
const lastPropLists = await self.getLastPropLists();
Copy link
Member Author

@boutell boutell Sep 10, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

A "prop list" is a list of all schema properties that existed for a doc type last time, for comparison with the new list for this startup to determine if any property additions might potentially be needed. We make this determination quickly and move on if new work is not needed.

const propLists = self.getPropLists();
let changesToPropLists = false;
for (const name of Object.keys(propLists)) {
if (!_.isEqual(lastPropLists?.[name], propLists[name])) {
changesToPropLists = true;
scans++;
updates += await self.addMissingSchemaFieldsForDocType(name, propLists[name]);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This method accepts only 1 parameter below

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The migration phase will run on any startup. Including an empty task. So omitting the call from the task is just eliminating redundancy.

At a future time we might reintroduce an optional behavior where the migration does not run in production except in the task, but that behavior has effectively not been possible before, so I don't have to introduce that feature to fix the redundancy bug.

}
}
if (changesToPropLists) {
await self.updateLastPropLists(propLists);
}
// Returned for tests
return {
scans,
updates
};
},
async addMissingSchemaFieldsForDocType(name) {
let updates = 0;
const schema = (self.apos.doc.managers[name] || {}).schema;
if (!schema) {
return;
}
await self.eachDoc({
type: name
}, async doc => {
const changes = {};
await self.addMissingSchemaFieldsFor(doc, schema, '', changes);
if (Object.keys(changes).length > 0) {
updates++;
return self.apos.doc.db.updateOne({
_id: doc._id
}, {
$set: changes
});
}
});
return updates;
},
// Adds changes to the object "changes" so that a single
// $set call can be made at the end. Use of a single
// object passed by reference also avoids creating many
// unnecessary objects in memory during a time-sensitive
// operation
addMissingSchemaFieldsFor(doc, schema, dotPath, changes) {
for (const field of schema) {
const newDotPath = dotPath ? `${dotPath}.${field.name}` : field.name;
// Supply the default
if (doc[field.name] === undefined) {
// Only undefined should fall back here
const def = klona((field.def === undefined) ? self.apos.schema.fieldTypes[field.type]?.def : field.def);
if (def !== undefined) {
if (!Object.hasOwn(changes, dotPath)) {
changes[newDotPath] = def;
}
// Also change it in memory so that if this is a subproperty of a
// new object, the change for that new object will have this
// subproperty too, plus we don't get crashes above when testing the
// subproperties' current values
doc[field.name] = def;
}
}
// Address defaults of subproperties
if (field.type === 'area') {
const basePath = `${newDotPath}.items`;
for (let i = 0; (i < (doc[field.name]?.items || []).length); i++) {
const widgetPath = `${basePath}.${i}`;
const widget = doc[field.name].items[i];
const widgetSchema = self.apos.area.getWidgetManager(widget.type)?.schema;
if (!widgetSchema) {
continue;
}
self.addMissingSchemaFieldsFor(widget, widgetSchema, widgetPath, changes);
}
} else if (field.type === 'object') {
self.addMissingSchemaFieldsFor(doc[field.name], field.schema, newDotPath, changes);
} else if (field.type === 'array') {
for (let i = 0; (i < (doc[field.name] || []).length); i++) {
const itemPath = `${newDotPath}.${i}`;
const item = doc[field.name][i];
self.addMissingSchemaFieldsFor(item, field.schema, itemPath, changes);
}
} else if (field.type === 'relationship') {
for (const [ key, item ] of Object.entries(doc[field.fieldsStorage] || {})) {
const itemPath = `${newDotPath}.${field.fieldsStorage}.${key}`;
self.addMissingSchemaFieldsFor(item, field.schema, itemPath, changes);
}
}
}
return changes;
},
getPropLists() {
const schema = {};
for (const [ name, module ] of Object.entries(self.apos.doc.managers)) {
if (!module.__meta.name) {
// Just a placeholder for a type present in the
// database but not found in the project code
continue;
}
schema[name] = [];
self.expandPropList(module.schema, schema[name], '');
}
return schema;
},
expandPropList(schema, propList, dotPath) {
for (const field of schema) {
const newDotPath = dotPath ? `${dotPath}.${field.name}` : field.name;
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We're not coming up with a real dot path to patch anything at this stage, just a conceptual dot path that namespaces these properties enough to determine if there are differences from one generation of code to the next by comparing two lists of paths.

Real dot paths get figured out in the guts of addMissingSchemaFieldsFor, which does the hard work if a difference does exist.

propList.push(newDotPath);
if (field.schema) {
self.expandPropList(field.schema, propList, newDotPath);
}
}
},
async getLastPropLists() {
const result = await self.db.findOne({
_id: '*lastPropLists'
});
return result?.propLists;
},
async updateLastPropLists(propLists) {
return self.db.updateOne({
_id: '*lastPropLists'
}, {
$set: {
propLists
}
}, {
upsert: true
});
}
};
};
Loading