
Proposal: Make Collections Top Level Entities

Ivan Kirillov edited this page Aug 11, 2015 · 32 revisions

Status: Open
Comment Period Closes: August 20th, 2015
Affects Backwards Compatibility: Yes
Relevant Issues: https://github.com/MAECProject/schemas/issues/96

Background Information

Because of their extensive use of nesting (e.g., Collections -> Behavior_Collections -> Behavior_Collection), Bundle Collections are cumbersome to use and perhaps unnecessarily complicated. In addition, as currently defined, Bundle Collections and top-level container elements (e.g., Actions, Behaviors, Objects) serve similar functions: both store or reference a single type of MAEC entity (e.g., only Actions or only Objects).

Related Proposals

This proposal is related to the following proposed changes to the schema:

Proposal

We propose to deprecate entity-specific collections in favor of more general Collections defined as top-level entities in a MAEC Package. A "maec_entity_type" field would define the type of entity captured in the Collection. General, top-level Collections would provide flexibility by permitting collections of any MAEC entities, including Malware Subjects.


The following existing schema types would be deprecated: maecBundle:CollectionsType, maecBundle:BehaviorCollectionListType, maecBundle:BehaviorCollectionType, maecBundle:ActionCollectionListType, maecBundle:ActionCollectionType, maecBundle:ObjectCollectionListType, maecBundle:ObjectCollectionType, maecBundle:CandidateIndicatorCollectionListType, maecBundle:CandidateIndicatorCollectionType, maecBundle:BaseCollectionType.


A new CollectionType schema type would be defined in the MAEC Package schema with the following fields:

| Field | Type | Multiplicity | Description |
| --- | --- | --- | --- |
| @id | xs:QName | 1 | The id field specifies a unique identifier for the Collection. |
| @type | maecVocabs:CollectionTypeEnum | 0-1 | The type field specifies the nature of the contents of the Collection, via the CollectionTypeEnum. |
| @maec_entity_type | maecVocabs:CollectionEntityTypeEnum | 1 | The required maec_entity_type field specifies the type of MAEC entity that is captured in the Collection, via the CollectionEntityTypeEnum. Example types would be 'objects' or 'various'. The default value is 'various'. |
| Name | xs:string | 0-1 | The Name field specifies the name of the Collection. |
| Entity_Reference | maecCore:EntityReferenceType | 0-* | The Entity_Reference field references an existing MAEC entity that is captured in the Collection, via its ID. |
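
The field table above could be expressed in XML Schema roughly as follows. This is a minimal sketch, not the proposed schema text: the `urn:example:*` namespace URIs are placeholders rather than the real MAEC namespaces, and `maecCore:EntityReferenceType` and the two enumerations are assumed to be defined in the imported schemas.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<!-- Sketch only: the urn:example:* namespace URIs are placeholders,
     not the real MAEC namespaces. -->
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema"
           xmlns:maecCore="urn:example:maec-core"
           xmlns:maecVocabs="urn:example:maec-vocabs"
           targetNamespace="urn:example:maec-package"
           elementFormDefault="qualified">

  <!-- Assumed to supply EntityReferenceType and the two enumerations. -->
  <xs:import namespace="urn:example:maec-core"/>
  <xs:import namespace="urn:example:maec-vocabs"/>

  <xs:complexType name="CollectionType">
    <xs:sequence>
      <!-- Name: multiplicity 0-1 -->
      <xs:element name="Name" type="xs:string" minOccurs="0"/>
      <!-- Entity_Reference: multiplicity 0-* -->
      <xs:element name="Entity_Reference"
                  type="maecCore:EntityReferenceType"
                  minOccurs="0" maxOccurs="unbounded"/>
    </xs:sequence>
    <!-- id: multiplicity 1 -->
    <xs:attribute name="id" type="xs:QName" use="required"/>
    <!-- type: multiplicity 0-1 (attributes are optional unless marked required) -->
    <xs:attribute name="type" type="maecVocabs:CollectionTypeEnum"/>
    <!-- maec_entity_type: XSD does not allow use="required" together with
         a default, so the attribute is declared with a default only; a
         schema-aware processor supplies 'various' when it is omitted. -->
    <xs:attribute name="maec_entity_type"
                  type="maecVocabs:CollectionEntityTypeEnum"
                  default="various"/>
  </xs:complexType>
</xs:schema>
```

One design point this sketch surfaces: because XML Schema forbids combining `use="required"` with `default`, a multiplicity of 1 with a default value of 'various' would have to be realized through the default alone.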

There may be cases where a Collection must be associated with a particular Malware Subject. To handle this requirement, we propose using a first-class relationship (see example).


A new enumeration of possible Collection types, the CollectionTypeEnum, would be created with the following values:

| Value | Description |
| --- | --- |
| file system entities | The 'file system entities' value specifies that the Collection contains ONLY file system related entities; for example, this could include MAEC Actions that operate on files and/or CybOX File Objects. |
| network entities | The 'network entities' value specifies that the Collection contains ONLY network related entities; for example, this could include MAEC Actions that operate on sockets and/or CybOX Address Objects. |
| process entities | The 'process entities' value specifies that the Collection contains ONLY operating system process related entities; for example, this could include MAEC Actions that operate on processes and/or CybOX Process Objects. |
| memory entities | The 'memory entities' value specifies that the Collection contains ONLY memory related entities; for example, this could include MAEC Actions that operate on system memory and/or CybOX Memory Objects. |
| ipc entities | The 'ipc entities' value specifies that the Collection contains ONLY interprocess-communication related entities; for example, this could include MAEC Actions that operate on mutexes and/or CybOX Mutex Objects. |
| device entities | The 'device entities' value specifies that the Collection contains ONLY system device related entities; for example, this could include MAEC Actions that operate on disks and/or CybOX Disk Objects. |
| registry entities | The 'registry entities' value specifies that the Collection contains ONLY Windows registry related entities; for example, this could include MAEC Actions that operate on registry keys and/or CybOX Windows Registry Key Objects. |
| service entities | The 'service entities' value specifies that the Collection contains ONLY Windows service related entities; for example, this could include MAEC Actions that operate on services and/or CybOX Windows Service Objects. |
| potential indicators | The 'potential indicators' value specifies that the Collection contains entities that could serve as potential indicators for a malware instance; for example, this could include specific CybOX File Objects that are created by the malware instance on a host system. |
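
As a rough illustration, the CollectionTypeEnum values listed above could be declared as a string restriction in XML Schema. This is a sketch under assumptions: the target namespace URI is a placeholder, and the actual MAEC vocabulary schemas may structure their enumerations differently.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<!-- Sketch only: urn:example:maec-vocabs is a placeholder namespace. -->
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema"
           targetNamespace="urn:example:maec-vocabs">
  <xs:simpleType name="CollectionTypeEnum">
    <xs:restriction base="xs:string">
      <xs:enumeration value="file system entities"/>
      <xs:enumeration value="network entities"/>
      <xs:enumeration value="process entities"/>
      <xs:enumeration value="memory entities"/>
      <xs:enumeration value="ipc entities"/>
      <xs:enumeration value="device entities"/>
      <xs:enumeration value="registry entities"/>
      <xs:enumeration value="service entities"/>
      <xs:enumeration value="potential indicators"/>
    </xs:restriction>
  </xs:simpleType>
</xs:schema>
```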

A new enumeration of possible MAEC entity types that can be captured as part of a collection, the CollectionEntityTypeEnum, would be created with the following values:

| Value | Description |
| --- | --- |
| malware subjects | The 'malware subjects' value specifies that the collection contains ONLY MAEC Malware Subjects. |
| actions | The 'actions' value specifies that the collection contains ONLY MAEC Malware Actions. |
| objects | The 'objects' value specifies that the collection contains ONLY CybOX Objects. |
| behaviors | The 'behaviors' value specifies that the collection contains ONLY MAEC Behaviors. |
| process trees | The 'process trees' value specifies that the collection contains ONLY MAEC Process Trees. |
| various | The 'various' value specifies that the collection contains various types of entities, such as Malware Actions AND CybOX Objects. |
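
To illustrate how the two enumerations might combine, the fragment below sketches one mixed Collection and one Collection of Malware Subjects. All ids, names, and referenced entities are hypothetical; the element and attribute names follow the CollectionType fields proposed above.

```xml
<MAEC_Package>
  <Collections>
    <!-- Mixed contents (an Action and an Object), hence 'various'. -->
    <Collection id="collection-2" type="potential indicators" maec_entity_type="various">
      <Name>Candidate indicators for a dropped file</Name>
      <Entity_Reference entity_idref="action-1"/>
      <Entity_Reference entity_idref="object-1"/>
    </Collection>
    <!-- Only Malware Subjects; the optional type attribute and Name are omitted. -->
    <Collection id="collection-3" maec_entity_type="malware subjects">
      <Entity_Reference entity_idref="malware-subject-1"/>
      <Entity_Reference entity_idref="malware-subject-2"/>
    </Collection>
  </Collections>
</MAEC_Package>
```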

Example

This example assumes that all related proposals will be implemented.

<MAEC_Package>
  <Collections>
    <Collection id="collection-1" type="network entities" maec_entity_type="actions">
      <Name>Test collection of network actions</Name>
      <Entity_Reference entity_idref="action-1"/>
      <Entity_Reference entity_idref="action-2"/>
      <Entity_Reference entity_idref="action-3"/>
    </Collection>
  </Collections>
  
  <Malware_Subjects>
    <Malware_Subject id="malware-subject-1">
     ...
    </Malware_Subject>
  </Malware_Subjects>
 
  <Relationships>
    <Relationship id="relationship-1" source_id="malware-subject-1" target_id="collection-1">
      <Type>belongs to</Type>
    </Relationship>
  </Relationships>
</MAEC_Package>

Impact

This change will not be backward compatible and is one of several revisions planned for a new major version.

Requested Feedback

  1. Should Collections be top-level entities?
  2. Should Collections be able to capture any set of related entities?
  3. Is the CollectionType schema type reasonably defined?
  4. Is the maec_entity_type field useful and necessary?
  5. Do the values in the CollectionTypeEnum make sense? Are there any values that are missing and should be added?
  6. Should Relationships be used to associate Collections with Malware Subjects?
  7. Are there alternative solutions to making Collections more meaningful and easier to use?