-
Notifications
You must be signed in to change notification settings - Fork 467
Add support for multiple Fate instances to the Manager #4133
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
This commit updates the Manager to create two fate instances, one for store FATE operations against tables in the accumulo namespace (meta) in the existing zookeeper store and another fate instance to store FATE operations against user tables (user). The thrift API has been modified so that when acquiring a new transaction, the type is returned of the instance used because future operations with the same transaction will need to know which instance was used.
The Fate Admin command needs to be updated now that multiple instance types are used for Fate operations. This commit fixes the ITs that call the admin command operations but there is still more work to be done to complete all the changes.
I started a full IT build to see what breaks after the changes. |
There were some failures, most notably NamespacesIT which all tests were broken due to the new table creation. I fixed that and will kick off a new build to get an updated list of tests that needed looking at to see if the rest are real failures or not. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
These changes look good. Need to open a few follow on issues.
initializeFateInstance(context, FateInstanceType.META, | ||
new ZooStore<>(getZooKeeperRoot() + Constants.ZFATE, context.getZooReaderWriter())); | ||
initializeFateInstance(context, FateInstanceType.USER, | ||
new AccumuloStore<>(context, FateTable.NAME)); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
After these are setup, the map never changes. Could use an immutable map like the following.
initializeFateInstance(context, FateInstanceType.META, | |
new ZooStore<>(getZooKeeperRoot() + Constants.ZFATE, context.getZooReaderWriter())); | |
initializeFateInstance(context, FateInstanceType.USER, | |
new AccumuloStore<>(context, FateTable.NAME)); | |
var metaInstance = initializeFateInstance(context, FateInstanceType.META, | |
new ZooStore<>(getZooKeeperRoot() + Constants.ZFATE, context.getZooReaderWriter())); | |
var userInstance = initializeFateInstance(context, FateInstanceType.USER, | |
new AccumuloStore<>(context, FateTable.NAME)); | |
fateRefs = Map.of(FateInstanceType.META, metaInstance, FateInstanceType.USER, userInstance); |
and change the declaration of fateRefs to
private volatile Map<FateInstanceType,Fate<Manager>> fateRefs = Map.of();
One downside of this that variable can not be final anymore. An upside is its nice having the map be immutable when its never expected to change.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@keith-turner - I went ahead and decided to do this but used an AtomicReference
instead of volatile. I went with that approach because I could make that reference final still and then ensure things are expected by adding some state checks when initializing and also retrieving the reference to make sure on initialization it's null and on retrieval it exists.
It's all private code so in theory we shouldn't need to check the expected state each time but I thought it might be good in case future changes causes those assumptions to change. Let me know what you think.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@keith-turner - I went ahead and decided to do this but used an AtomicReference instead of volatile. I went with that approach because I could make that reference final still and then ensure things are expected by adding some state checks when initializing and also retrieving the reference to make sure on initialization it's null and on retrieval it exists.
That all looks good. The compare and set check that ensures its null when setting the map is nice, should help avoid future bugs where we create the fate instances multiple times.
server/manager/src/main/java/org/apache/accumulo/manager/Manager.java
Outdated
Show resolved
Hide resolved
|
||
goalMessage += "Offline table " + tableId; | ||
manager.fate().seedTransaction(op.toString(), opid, | ||
manager.fate(type).seedTransaction(op.toString(), tid, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Alot of the case statements have validation at the beggining. We could add validation that checks the the actual table id and fate type are sensible. I think this would be best done in a follow on PR.
server/manager/src/main/java/org/apache/accumulo/manager/FateServiceHandler.java
Outdated
Show resolved
Hide resolved
} | ||
|
||
private long beginFateOperation() throws ThriftSecurityException, TException { | ||
private TFateId beginFateOperation(TFateInstanceType type) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I was really surprised by how minimal the changes were to this class.
TFateInstanceType t = tableOrNamespaceName.startsWith(Namespace.ACCUMULO.name()) | ||
? TFateInstanceType.META : TFateInstanceType.USER; | ||
opid = beginFateOperation(t); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If we add a toThrift()
method to FateInstanceType could do :
TFateInstanceType t = tableOrNamespaceName.startsWith(Namespace.ACCUMULO.name()) | |
? TFateInstanceType.META : TFateInstanceType.USER; | |
opid = beginFateOperation(t); | |
opid = beginFateOperation(FateInstanceType.toType(tableOrNamespaceName).toThrift()); |
Could do the following for the toThrift() method.
public enum FateInstanceType {
public TFateInstanceType toThrift(){
switch (this) {
case USER:
return TFateInstanceType.USER;
case META:
return TFateInstanceType.META;
default:
throw new IllegalStateException("Unknown type "+this);
}
}
}
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is cleaner for sure, I like this change.
core/src/main/java/org/apache/accumulo/core/fate/FateInstanceType.java
Outdated
Show resolved
Hide resolved
} | ||
|
||
private static FateInstanceType getType(TFateId opid) { | ||
return FateInstanceType.valueOf(opid.getType().name()); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
return FateInstanceType.valueOf(opid.getType().name()); | |
return FateInstanceType.fromThrift(opid.getType()); |
could add the following the the FateInstanceType enum and call as above. The impl below uses switch and avoids the string valueOf lookup, may be a bit more efficient.
public static FateInstanceType fromThrift(TFateInstanceType tfit) {
switch (tfit) {
case USER:
return FateInstanceType.USER;
case META:
return FateInstanceType.META;
default:
throw new IllegalStateException("Unknown type "+tfit);
}
}
@Override | ||
public void configureMiniCluster(MiniAccumuloConfigImpl cfg, Configuration hadoopCoreSite) { | ||
cfg.setProperty(Property.TSERV_MAXMEM, "5K"); | ||
cfg.setMemory(ServerType.TABLET_SERVER, 384, MemoryUnit.MEGABYTE); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Wonder if this is needed because ITs are now writing more data to the new fate table. May be good to open a follow on issue to track down the cause of this.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This was pretty weird, it was happening during scans according to the logs. It certainly seems related to more data being written and I agree that it should be investigated more. One option to figure it out would be to enable the JVM to dump the heap file on OOM and then we can analyze what's in the file using something like the Eclipse MAT
Co-authored-by: Keith Turner <[email protected]>
* Improve handling of FateInstance enums and conversions * Make fate refs map immutable and store in atomic reference
I dug into the OOM error a little bit and I opened up #4157 where I will put some info in to track the investigation and report findings. I also checked the full IT build and there are a few more test failures so I want to investigate those before merging this. |
This updates 3 tests to account for the existence of the new FateTable as part of the accumulo namespace: MetaSplitIT.java TabletManagementIteratorIT.java WALSunnyDayIT.java
server/manager/src/main/java/org/apache/accumulo/manager/split/SplitTask.java
Outdated
Show resolved
Hide resolved
I'm going to go ahead and merge and open up some follow on issues |
Three tables in the "accumulo." namespace were defined as constants in the three classes in the code. The change gathers the definitions of these three tables into a single enum in the code. The motivation for this change was that the new fate code introduced in apache#4133 needed to know of the all the accumulo tables ids. The way the code was structured if a new accumulo.X table is added in the future someone may not realize the fate code needs to know about it. This change makes the dependency more apparent in the code. Existing constants for the metadata and root table name and id were not inlined in the commit for two reasons. First, it would make this commit hard to review. Second, it would make merging main into the elasticity branch more difficult (its already tricky). Inlining these constants can easily be done as a stand alone commit later when the elasticity branch is ready to merge into main.
As part of apache#4133 the null check for fateRef inside of setManagerState() method in the Manager was inadvertently changed to check if the map size was non empty but the actual check should be that it's null which means not initialized. The map will never be non empty if it is set. This restores the previous check that used to check for a null ref
As part of apache#4133 the null check for fateRef inside of setManagerState() method in the Manager was inadvertently changed to check if the map size was non empty but the actual check should be that it's null which means not initialized. The map will never be non empty if it is set. This restores the previous check that used to check for a null ref
As part of #4133 the null check for fateRef inside of setManagerState() method in the Manager was inadvertently changed to check if the map size was non empty but the actual check should be that it's null which means not initialized. The map will never be non empty if it is set. This restores the previous check that used to check for a null ref
This PR updates the Manager to create two Fate instances, one for store FATE operations against tables in the accumulo namespace (meta) in the existing zookeeper store and another instance to store FATE operations against user tables (user). The thrift API has been modified so that when acquiring a new transaction, the type is returned of the instance used because future operations with the same transaction will need to know which instance was used. A new table has been created
accumulo.fate
that is used to store FATE data for the accumulo store.I also included a second commit which starts the process of reworking the Fate Admin command to be able to use the new Accumulo store as well. This was needed to fix the FateSummaryIT and FateConcurrentIT. The work here is not complete and there is more to do to make it fully featured (such as being able to filter by fate instance type in the command) so I put it into its own commit as we might (probably) want to move this work to a follow on PR but I included it for now to show the direction I was headed. We could also keep what's here so the tests work and create a follow on PR to fix the rest.
For testing, I ran through some of the ITs but not all so I need to kick off a full build. One interesting thing of note was after making changes here the SplitIT (specifically concurrentSplit) test intermittently was causing one of the tservers to die with OOM error. I bumped the default to 384M from 256M and that seemed to solve it.
Issues to still address (likely as follow on Issues):
accumulo.fate
table on upgrade so it exists. The current code handles initializing new instances with the table.