Skip to content

azcopy_sync

GitHub Action edited this page Jan 10, 2025 · 4 revisions

azcopy sync

Replicate source to the destination location

Synopsis

The last modified times are used for comparison. The file is skipped if the last modified time in the destination is more recent. Alternatively, you can use the --compare-hash flag to transfer only files which differ in their MD5 hash. The supported pairs are:

  • Local <-> Azure Blob / Azure File (either SAS or OAuth authentication can be used)
  • Azure Blob <-> Azure Blob (either SAS or OAuth authentication can be used)
  • ADLS Gen2 <-> ADLS Gen2 (either SAS or OAuth authentication can be used)
  • Azure File <-> Azure File (Source must include a SAS or is publicly accessible; SAS authentication should be used for destination)
  • Azure Blob <-> Azure File

The sync command differs from the copy command in several ways:

  1. By default, the recursive flag is true and sync copies all subdirectories. Sync only copies the top-level files inside a directory if the recursive flag is false.
  2. When syncing between virtual directories, add a trailing slash to the path (refer to examples) if there's a blob with the same name as one of the virtual directories.
  3. If the 'deleteDestination' flag is set to true or prompt, then sync will delete files and blobs at the destination that are not present at the source.

Advanced: AzCopy does not support modifications to the source or destination during a transfer.

Please note that if you don't specify a file extension, AzCopy automatically detects the content type of the files when uploading from the local disk, based on the file extension or content.

The built-in lookup table is small but on Unix it is augmented by the local system's mime.types file(s) if available under one or more of these names:

  • /etc/mime.types
  • /etc/apache2/mime.types
  • /etc/apache/mime.types

On Windows, MIME types are extracted from the registry.

By default, sync works off of the last modified times unless you override that default behavior by using the --compare-hash flag. So in the case of Azure File <-> Azure File, the header field Last-Modified is used instead of x-ms-file-change-time, which means that metadata changes at the source can also trigger a full copy.

azcopy sync [flags]

Examples


Sync a single file:

   - azcopy sync "/path/to/file.txt" "https://[account].blob.core.windows.net/[container]/[path/to/blob]"

Same as above, but also compute an MD5 hash of the file content, and then save that MD5 hash as the blob's Content-MD5 property. 

   - azcopy sync "/path/to/file.txt" "https://[account].blob.core.windows.net/[container]/[path/to/blob]" --put-md5

Sync an entire directory including its subdirectories (note that recursive is by default on):

   - azcopy sync "/path/to/dir" "https://[account].blob.core.windows.net/[container]/[path/to/virtual/dir]"
or
  - azcopy sync "/path/to/dir" "https://[account].blob.core.windows.net/[container]/[path/to/virtual/dir]" --put-md5

Sync only the files inside of a directory but not subdirectories or the files inside of subdirectories:

   - azcopy sync "/path/to/dir" "https://[account].blob.core.windows.net/[container]/[path/to/virtual/dir]" --recursive=false

Sync a subset of files in a directory (For example: only jpg and pdf files, or if the file name is "exactName"):

   - azcopy sync "/path/to/dir" "https://[account].blob.core.windows.net/[container]/[path/to/virtual/dir]" --include-pattern="*.jpg;*.pdf;exactName"

Sync an entire directory but exclude certain files from the scope (For example: every file that starts with foo or ends with bar):

   - azcopy sync "/path/to/dir" "https://[account].blob.core.windows.net/[container]/[path/to/virtual/dir]" --exclude-pattern="foo*;*bar"

Sync a single blob:

   - azcopy sync "https://[account].blob.core.windows.net/[container]/[path/to/blob]?[SAS]" "https://[account].blob.core.windows.net/[container]/[path/to/blob]"

Sync a virtual directory:

   - azcopy sync "https://[account].blob.core.windows.net/[container]/[path/to/virtual/dir]?[SAS]" "https://[account].blob.core.windows.net/[container]/[path/to/virtual/dir]" --recursive=true

Sync a virtual directory that has the same name as a blob (add a trailing slash to the path in order to disambiguate):

   - azcopy sync "https://[account].blob.core.windows.net/[container]/[path/to/virtual/dir]/?[SAS]" "https://[account].blob.core.windows.net/[container]/[path/to/virtual/dir]/" --recursive=true

Sync an Azure File directory (same syntax as Blob):

   - azcopy sync "https://[account].file.core.windows.net/[share]/[path/to/dir]?[SAS]" "https://[account].file.core.windows.net/[share]/[path/to/dir]" --recursive=true

Note: if include and exclude flags are used together, only files matching the include patterns are used, but those matching the exclude patterns are ignored.

Options

      --block-size-mb float                                   Use this block size (specified in MiB) when uploading to Azure Storage or downloading from Azure Storage. Default is automatically calculated based on file size. Decimal fractions are allowed (For example: 0.25).
      --check-md5 string                                      Specifies how strictly MD5 hashes should be validated when downloading. This option is only available when downloading. Available values include: NoCheck, LogOnly, FailIfDifferent, FailIfDifferentOrMissing. (default 'FailIfDifferent'). (default "FailIfDifferent")
      --compare-hash string                                   Inform sync to rely on hashes as an alternative to LMT. Missing hashes at a remote source will throw an error. (None, MD5) Default: None (default "None")
      --cpk-by-name string                                    Client provided key by name let clients making requests against Azure Blob storage an option to provide an encryption key on a per-request basis. Provided key name will be fetched from Azure Key Vault and will be used to encrypt the data
      --cpk-by-value                                          False by default. Client provided key by name let clients making requests against Azure Blob storage an option to provide an encryption key on a per-request basis. Provided key and its hash will be fetched from environment variables (CPK_ENCRYPTION_KEY and CPK_ENCRYPTION_KEY_SHA256 must be set).
      --delete-destination string                             Defines whether to delete extra files from the destination that are not present at the source. Could be set to true, false, or prompt. If set to prompt, the user will be asked a question before scheduling files and blobs for deletion. (default 'false'). (default "false")
      --dry-run                                               False by default. Prints the path of files that would be copied or removed by the sync command. This flag does not copy or remove the actual files.
      --exclude-attributes string                             (Windows only) Exclude files whose attributes match the attribute list. For example: A;S;R
      --exclude-path string                                   Exclude these paths when comparing the source against the destination. This option does not support wildcard characters (*). Checks relative path prefix(For example: myFolder;myFolder/subDirName/file.pdf).
      --exclude-pattern string                                Exclude files where the name matches the pattern list. For example: *.jpg;*.pdf;exactName
      --exclude-regex string                                  Exclude the relative path of the files that match with the regular expressions. Separate regular expressions with ';'.
      --force-if-read-only                                    False by default. When overwriting an existing file on Windows or Azure Files, force the overwrite to work even if the existing file has its read-only attribute set.
      --from-to string                                        Optionally specifies the source destination combination. For Example: LocalBlob, BlobLocal, LocalFile, FileLocal, BlobFile, FileBlob, etc.
      --hash-meta-dir --local-hash-storage-mode=HiddenFiles   When using --local-hash-storage-mode=HiddenFiles you can specify an alternate directory to store hash metadata files in (as opposed to next to the related files in the source)
  -h, --help                                                  help for sync
      --include-attributes string                             (Windows only) Include only files whose attributes match the attribute list. For example: A;S;R
      --include-directory-stub                                False by default, includes blobs with the hdi_isfolder metadata in the transfer.
      --include-pattern string                                Include only files where the name matches the pattern list. For example: *.jpg;*.pdf;exactName
      --include-regex string                                  Include the relative path of the files that match with the regular expressions. Separate regular expressions with ';'.
      --local-hash-storage-mode string                        Specify an alternative way to cache file hashes; valid options are: HiddenFiles (OS Agnostic), XAttr (Linux/MacOS only; requires user_xattr on all filesystems traversed @ source), AlternateDataStreams (Windows only; requires named streams on target volume) (default "XAttr")
      --mirror-mode                                           Disable last-modified-time based comparison and overwrites the conflicting files and blobs at the destination if this flag is set to true. Default is false.
      --preserve-permissions                                  False by default. Preserves ACLs between aware resources (Windows and Azure Files, or ADLS Gen 2 to ADLS Gen 2). For Hierarchical Namespace accounts, you will need a container SAS or OAuth token with Modify Ownership and Modify Permissions permissions. For downloads, you will also need the --backup flag to restore permissions where the new Owner will not be the user running AzCopy. This flag applies to both files and folders, unless a file-only filter is specified (e.g. include-pattern).
      --preserve-posix-properties                             False by default. 'Preserves' property info gleaned from stat or statx into object metadata.
      --preserve-smb-info                                     Preserves SMB property info (last write time, creation time, attribute bits) between SMB-aware resources (Windows and Azure Files). On windows, this flag will be set to true by default. If the source or destination is a volume mounted on Linux using SMB protocol, this flag will have to be explicitly set to true. Only the attribute bits supported by Azure Files will be transferred; any others will be ignored. This flag applies to both files and folders, unless a file-only filter is specified (e.g. include-pattern). The info transferred for folders is the same as that for files, except for Last Write Time which is never preserved for folders.
      --put-blob-size-mb float                                Use this size (specified in MiB) as a threshold to determine whether to upload a blob as a single PUT request when uploading to Azure Storage. The default value is automatically calculated based on file size. Decimal fractions are allowed (For example: 0.25).
      --put-md5                                               Create an MD5 hash of each file, and save the hash as the Content-MD5 property of the destination blob or file. (By default the hash is NOT created.) Only available when uploading.
      --recursive                                             True by default, look into sub-directories recursively when syncing between directories. (default true). (default true)
      --s2s-preserve-access-tier                              Preserve access tier during service to service copy. Please refer to [Azure Blob storage: hot, cool, and archive access tiers](https://docs.microsoft.com/azure/storage/blobs/storage-blob-storage-tiers) to ensure destination storage account supports setting access tier. In the cases that setting access tier is not supported, please use s2sPreserveAccessTier=false to bypass copying access tier (default true).  (default true)
      --s2s-preserve-blob-tags                                False by default. Preserve index tags during service to service sync from one blob storage to another.
      --trailing-dot string                                   'Enable' by default to treat file share related operations in a safe manner. Available options: Enable, Disable, AllowToUnsafeDestination. Choose 'Disable' to go back to legacy (potentially unsafe) treatment of trailing dot files where the file service will trim any trailing dots in paths. This can result in potential data corruption if the transfer contains two paths that differ only by a trailing dot (ex: mypath and mypath.). If this flag is set to 'Disable' and AzCopy encounters a trailing dot file, it will warn customers in the scanning log but will not attempt to abort the operation.If the destination does not support trailing dot files (Windows or Blob Storage), AzCopy will fail if the trailing dot file is the root of the transfer and skip any trailing dot paths encountered during enumeration.

Options inherited from parent commands

      --cap-mbps float                      Caps the transfer rate, in megabits per second. Moment-by-moment throughput might vary slightly from the cap. If this option is set to zero, or it is omitted, the throughput isn't capped.
      --log-level string                    Define the log verbosity for the log file, available levels: INFO(all requests/responses), WARNING(slow responses), ERROR(only failed requests), and NONE(no output logs). (default 'INFO'). (default "INFO")
      --output-level string                 Define the output verbosity. Available levels: essential, quiet. (default "default")
      --output-type string                  Format of the command's output. The choices include: text, json. The default value is 'text'. (default "text")
      --skip-version-check                  Do not perform the version check at startup. Intended for automation scenarios & airgapped use.
      --trusted-microsoft-suffixes string   Specifies additional domain suffixes where Azure Active Directory login tokens may be sent.  The default is '*.core.windows.net;*.core.chinacloudapi.cn;*.core.cloudapi.de;*.core.usgovcloudapi.net;*.storage.azure.net'. Any listed here are added to the default. For security, you should only put Microsoft Azure domains here. Separate multiple entries with semi-colons.

SEE ALSO

  • azcopy - AzCopy is a command line tool that moves data into and out of Azure Storage.
Auto generated by spf13/cobra on 10-Jan-2025
Clone this wiki locally