Skip to content

Conversation

cshannon
Copy link
Contributor

@cshannon cshannon commented Jan 7, 2024

This PR updates the Manager to create two Fate instances, one for store FATE operations against tables in the accumulo namespace (meta) in the existing zookeeper store and another instance to store FATE operations against user tables (user). The thrift API has been modified so that when acquiring a new transaction, the type is returned of the instance used because future operations with the same transaction will need to know which instance was used. A new table has been created accumulo.fate that is used to store FATE data for the accumulo store.

I also included a second commit which starts the process of reworking the Fate Admin command to be able to use the new Accumulo store as well. This was needed to fix the FateSummaryIT and FateConcurrentIT. The work here is not complete and there is more to do to make it fully featured (such as being able to filter by fate instance type in the command) so I put it into its own commit as we might (probably) want to move this work to a follow on PR but I included it for now to show the direction I was headed. We could also keep what's here so the tests work and create a follow on PR to fix the rest.

For testing, I ran through some of the ITs but not all so I need to kick off a full build. One interesting thing of note was after making changes here the SplitIT (specifically concurrentSplit) test intermittently was causing one of the tservers to die with OOM error. I bumped the default to 384M from 256M and that seemed to solve it.

Issues to still address (likely as follow on Issues):

  1. As noted above, need to finish fixing the Fate Admin command api to make sure it's fully compatible with ZK and Accumulo instances/stores and will support filtering correctly by instance type.
  2. Conditional mutations need to be added to the Accumulo store which is being addressed already by: Update FATE AccumuloStore to use conditional mutations #4114
  3. Upgrades need to be handled. Existing instances will need to create the new accumulo.fate table on upgrade so it exists. The current code handles initializing new instances with the table.
  4. Currently the Accumulo store uses the AgeOff store (just like zookeeper) so on initialization the list of existing transactions is loaded. This means while the Manager is still starting up and initializing the store it will scan the fate table which causes the Tablet sever that accepts the scan to log an exception. This exception is due to the tablet server loading a tablet and trying to send a message back to the Manager over the manager RPC channel that the tablet was loaded but because the Manager hasn't finished loading, the Tsever can't connect back to the Manager yet so an error is logged. Eventually a retry happens and the message is passed back successfully so this is not really a huge issue as the scan completes successfully and eventually the message gets sent back on retry but it does cause an ugly error message on start up. I haven't not addressed this yet because this behavior is very likely to change with the update to WIP distributed FATE #3964 anyways

This commit updates the Manager to create two fate instances, one for
store FATE operations against tables in the accumulo namespace (meta) in
the existing zookeeper store and another fate instance to store FATE
operations against user tables (user). The thrift API has been modified
so that when acquiring a new transaction, the type is returned of the
instance used because future operations with the same transaction will
need to know which instance was used.
The Fate Admin command needs to be updated now that multiple instance
types are used for Fate operations. This commit fixes the ITs that call
the admin command operations but there is still more work to be done to
complete all the changes.
@cshannon cshannon requested a review from keith-turner January 7, 2024 18:17
@cshannon cshannon self-assigned this Jan 7, 2024
@cshannon
Copy link
Contributor Author

cshannon commented Jan 7, 2024

I started a full IT build to see what breaks after the changes.

@cshannon
Copy link
Contributor Author

I started a full IT build to see what breaks after the changes.

There were some failures, most notably NamespacesIT which all tests were broken due to the new table creation. I fixed that and will kick off a new build to get an updated list of tests that needed looking at to see if the rest are real failures or not.

Copy link
Contributor

@keith-turner keith-turner left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

These changes look good. Need to open a few follow on issues.

Comment on lines 1076 to 1079
initializeFateInstance(context, FateInstanceType.META,
new ZooStore<>(getZooKeeperRoot() + Constants.ZFATE, context.getZooReaderWriter()));
initializeFateInstance(context, FateInstanceType.USER,
new AccumuloStore<>(context, FateTable.NAME));
Copy link
Contributor

@keith-turner keith-turner Jan 11, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

After these are setup, the map never changes. Could use an immutable map like the following.

Suggested change
initializeFateInstance(context, FateInstanceType.META,
new ZooStore<>(getZooKeeperRoot() + Constants.ZFATE, context.getZooReaderWriter()));
initializeFateInstance(context, FateInstanceType.USER,
new AccumuloStore<>(context, FateTable.NAME));
var metaInstance = initializeFateInstance(context, FateInstanceType.META,
new ZooStore<>(getZooKeeperRoot() + Constants.ZFATE, context.getZooReaderWriter()));
var userInstance = initializeFateInstance(context, FateInstanceType.USER,
new AccumuloStore<>(context, FateTable.NAME));
fateRefs = Map.of(FateInstanceType.META, metaInstance, FateInstanceType.USER, userInstance);

and change the declaration of fateRefs to

 private volatile Map<FateInstanceType,Fate<Manager>> fateRefs = Map.of();

One downside of this that variable can not be final anymore. An upside is its nice having the map be immutable when its never expected to change.

Copy link
Contributor Author

@cshannon cshannon Jan 12, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@keith-turner - I went ahead and decided to do this but used an AtomicReference instead of volatile. I went with that approach because I could make that reference final still and then ensure things are expected by adding some state checks when initializing and also retrieving the reference to make sure on initialization it's null and on retrieval it exists.

It's all private code so in theory we shouldn't need to check the expected state each time but I thought it might be good in case future changes causes those assumptions to change. Let me know what you think.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@keith-turner - I went ahead and decided to do this but used an AtomicReference instead of volatile. I went with that approach because I could make that reference final still and then ensure things are expected by adding some state checks when initializing and also retrieving the reference to make sure on initialization it's null and on retrieval it exists.

That all looks good. The compare and set check that ensures its null when setting the map is nice, should help avoid future bugs where we create the fate instances multiple times.


goalMessage += "Offline table " + tableId;
manager.fate().seedTransaction(op.toString(), opid,
manager.fate(type).seedTransaction(op.toString(), tid,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Alot of the case statements have validation at the beggining. We could add validation that checks the the actual table id and fate type are sensible. I think this would be best done in a follow on PR.

}

private long beginFateOperation() throws ThriftSecurityException, TException {
private TFateId beginFateOperation(TFateInstanceType type)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I was really surprised by how minimal the changes were to this class.

Comment on lines 396 to 398
TFateInstanceType t = tableOrNamespaceName.startsWith(Namespace.ACCUMULO.name())
? TFateInstanceType.META : TFateInstanceType.USER;
opid = beginFateOperation(t);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If we add a toThrift() method to FateInstanceType could do :

Suggested change
TFateInstanceType t = tableOrNamespaceName.startsWith(Namespace.ACCUMULO.name())
? TFateInstanceType.META : TFateInstanceType.USER;
opid = beginFateOperation(t);
opid = beginFateOperation(FateInstanceType.toType(tableOrNamespaceName).toThrift());

Could do the following for the toThrift() method.

public enum FateInstanceType {
  public TFateInstanceType toThrift(){
    switch (this) {
      case USER:
        return TFateInstanceType.USER;
      case META:
        return TFateInstanceType.META;
      default:
        throw new IllegalStateException("Unknown type "+this);
    }
  }
}

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is cleaner for sure, I like this change.

}

private static FateInstanceType getType(TFateId opid) {
return FateInstanceType.valueOf(opid.getType().name());
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
return FateInstanceType.valueOf(opid.getType().name());
return FateInstanceType.fromThrift(opid.getType());

could add the following the the FateInstanceType enum and call as above. The impl below uses switch and avoids the string valueOf lookup, may be a bit more efficient.

  public static FateInstanceType fromThrift(TFateInstanceType tfit) {
    switch (tfit) {
      case USER:
        return FateInstanceType.USER;
      case META:
        return FateInstanceType.META;
      default:
        throw new IllegalStateException("Unknown type "+tfit);
    }
  }

@Override
public void configureMiniCluster(MiniAccumuloConfigImpl cfg, Configuration hadoopCoreSite) {
cfg.setProperty(Property.TSERV_MAXMEM, "5K");
cfg.setMemory(ServerType.TABLET_SERVER, 384, MemoryUnit.MEGABYTE);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Wonder if this is needed because ITs are now writing more data to the new fate table. May be good to open a follow on issue to track down the cause of this.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This was pretty weird, it was happening during scans according to the logs. It certainly seems related to more data being written and I agree that it should be investigated more. One option to figure it out would be to enable the JVM to dump the heap file on OOM and then we can analyze what's in the file using something like the Eclipse MAT

cshannon and others added 3 commits January 12, 2024 07:33
* Improve handling of FateInstance enums and conversions
* Make fate refs map immutable and store in atomic reference
@cshannon
Copy link
Contributor Author

I dug into the OOM error a little bit and I opened up #4157 where I will put some info in to track the investigation and report findings.

I also checked the full IT build and there are a few more test failures so I want to investigate those before merging this.

This updates 3 tests to account for the existence of the new FateTable
as part of the accumulo namespace:

MetaSplitIT.java
TabletManagementIteratorIT.java
WALSunnyDayIT.java
@cshannon
Copy link
Contributor Author

I'm going to go ahead and merge and open up some follow on issues

@cshannon cshannon merged commit 738cabb into apache:elasticity Jan 12, 2024
@cshannon cshannon deleted the multi-fate branch January 12, 2024 23:03
keith-turner added a commit to keith-turner/accumulo that referenced this pull request Jan 13, 2024
Three tables in the "accumulo." namespace were defined as constants in
the three classes in the code. The change gathers the definitions of
these three tables into a single enum in the code. The motivation for
this change was that the new fate code introduced in apache#4133 needed to
know of the all the accumulo tables ids.  The way the code was
structured if a new accumulo.X table is added in the future someone may
not realize the fate code needs to know about it.  This change makes the
dependency more apparent in the code.

Existing constants for the metadata and root table name and id were not
inlined in the commit for two reasons.  First, it would make this commit
hard to review.  Second, it would make merging main into the elasticity
branch more difficult (its already tricky).  Inlining these constants
can easily be done as a stand alone commit later when the elasticity
branch is ready to merge into main.
cshannon added a commit to cshannon/accumulo that referenced this pull request Jan 20, 2024
As part of apache#4133 the null check for fateRef inside of setManagerState()
method in the Manager was inadvertently changed to check if the map size
was non empty but the actual check should be that it's null which means
not initialized. The map will never be non empty if it is set.

This restores the previous check that used to check for a null ref
cshannon added a commit to cshannon/accumulo that referenced this pull request Jan 20, 2024
As part of apache#4133 the null check for fateRef inside of setManagerState()
method in the Manager was inadvertently changed to check if the map size
was non empty but the actual check should be that it's null which means
not initialized. The map will never be non empty if it is set.

This restores the previous check that used to check for a null ref
cshannon added a commit that referenced this pull request Jan 20, 2024
As part of #4133 the null check for fateRef inside of setManagerState()
method in the Manager was inadvertently changed to check if the map size
was non empty but the actual check should be that it's null which means
not initialized. The map will never be non empty if it is set.

This restores the previous check that used to check for a null ref
@ctubbsii ctubbsii added this to the 4.0.0 milestone Jul 12, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

Status: Done

Development

Successfully merging this pull request may close these issues.

3 participants