Data Archiving: Best Practices

Data archiving allows offloading of backup data to supported tape libraries and cloud storage like Amazon S3. For instructions on using this feature with Catalogic DPX and vStor, refer to the Catalogic DPX User’s Guide, and ensure all components are configured correctly.

In addition, review the following tips for data archiving:

  • Verify that your archive storage is supported as per the Catalogic DPX Compatibility Guide.

  • Follow your vendor's system requirements and best practices for configuring archive storage.

  • A single Catalogic vStor should be linked to one Enterprise only. Sharing Catalogic DPX Archive storage across multiple Enterprises is not supported.

  • Catalogic vStor can be used for both backup and archiving; however, ensure you have sufficient I/O adapters, and consider using a dedicated host bus adapter (HBA) for each storage unit, including any additional storage you connect.

Restrictions with data archiving

Review the following restrictions before archiving your data:

  • Data archiving creates one task per volume contained within the snapshot, regardless of the Max Devices (Devices) value in the Set Job Destination Options dialog.

  • Archiving is incompatible with source volumes that use data deduplication, due to NDMP file history requirements. For more details, see Enabling NDMP file history.

  • Features like Instant Access and Bare Metal Restore are unavailable for restored archives.

Best practice for defining data archive jobs with Catalogic DPX

Review the following tips before defining a data archive job in the Java®-based DPX Management Interface:

  • Use separate data archiving media pools for each retention—weekly, monthly, yearly, and so on.

  • Archive frequencies typically range from weekly to monthly. Reserve daily archiving for compliance needs, and use the incremental archive when you archive daily.

  • Schedule backup tasks, data archiving tasks, and catalog condensing at different times of day, because data archiving tasks and catalog condensing tasks may take several hours.

  • To simultaneously manage backup and archiving for the same dataset, it's efficient to consolidate these processes into one job. This combined backup-archive job could be configured to perform backup operations during regular business hours, such as Monday to Friday from 8 a.m. to noon and 1 p.m. to 5 p.m., and execute archiving tasks weekly, perhaps every Sunday at 10 p.m.

To schedule a job:

  1. Open the Java-based DPX Management Interface.

  2. Select the Backup tab from the function tab bar.

  3. Navigate to 'Block' and then Block Backup Wizard in the task pane to initiate the wizard.

  4. Choose to create a new job or select an existing one.

  5. Proceed to the Save step within the wizard.

  6. Click Schedule Job to set up the timing for the job.

  7. In the Job Schedule section, edit the details of block backup jobs and archive jobs as needed, within the same job definition.

Estimating Archiving Time

Estimate archiving duration by dividing the data size by the transfer rate. For instance, for a 50 GB backup at 150 MB/s, the raw transfer takes about 5.6 minutes (50,000 MB ÷ 150 MB/s ≈ 333 seconds), so allow approximately 7 minutes once overhead is included. For larger data volumes, consider more tape drives or libraries. You can estimate archiving time for cloud storage using a similar method.
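
The estimate above can be sketched in a few lines of Python. This is a minimal illustration, not part of Catalogic DPX: the function names, the use of decimal units (1 GB = 1000 MB), and the assumption that throughput scales linearly with the number of drives are all hypothetical simplifications. The result covers pure transfer time only, so real jobs will take somewhat longer.

```python
def estimate_archive_seconds(data_gb, rate_mb_per_s, drives=1):
    """Estimate pure transfer time in seconds for an archive task.

    Assumptions (not taken from the DPX documentation): decimal units
    (1 GB = 1000 MB) and throughput that scales linearly with the
    number of drives working in parallel.
    """
    if data_gb < 0:
        raise ValueError("data_gb must not be negative")
    if rate_mb_per_s <= 0:
        raise ValueError("rate_mb_per_s must be positive")
    if drives < 1:
        raise ValueError("drives must be at least 1")
    data_mb = data_gb * 1000
    return data_mb / (rate_mb_per_s * drives)


def format_duration(seconds):
    """Render a duration in seconds as hours, minutes, and seconds."""
    total = int(round(seconds))
    hours, remainder = divmod(total, 3600)
    minutes, secs = divmod(remainder, 60)
    return f"{hours}h {minutes:02d}m {secs:02d}s"


if __name__ == "__main__":
    # The 50 GB at 150 MB/s case from the text, on a single drive.
    print(format_duration(estimate_archive_seconds(50, 150)))
    # A hypothetical 2 TB archive spread across two drives.
    print(format_duration(estimate_archive_seconds(2000, 150, drives=2)))
```

Because data archiving creates one task per volume, adding drives is only likely to help when the snapshot contains several volumes; treat the `drives` parameter as an upper bound on the speedup rather than a guarantee.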

Tape Libraries

For data archiving with tape libraries, follow these guidelines:

  • Establish a direct and exclusive connection between your tape library device and the Catalogic vStor appliance to facilitate faster data transfers. Avoid sharing the tape library with other servers to prevent slowing down the archiving process and creating additional network traffic.

  • Organize your tapes by job type and frequency of archiving. For instance, designate and label tapes specifically for Block Backup (Weekly), Block Backup (Monthly), and Image Backup (Weekly). This prevents the mixing of archive data from different backup types and frequencies on the same tape, which can complicate data management and retrieval.

Enabling NDMP file history

Data archiving for any job type requires NDMP file history.

Take the following steps to enable the NDMP File History for the image backups or the NDMP backups before defining data archive jobs:

  1. In the Java-based DPX Management Interface, open the Backup tab from the function tab bar.

  2. In the task pane, click Backup Modes: Image or NDMP.

  3. In the task pane, click Other Tasks: Set Source Options.

  4. In the Set Job Source Options dialog, open the Source tab.

  5. Set the NDMP File History Handling to either Process History on Local Client or Process History on Master Server.

  6. Click OK to close the dialog.

NDMP file history is also necessary for archiving block backup snapshots. Even if it was disabled on the original block backup job, the archive job generates and stores a new NDMP file history in the DPX catalog after the backup instance is archived.

You can also take the following steps to enable the NDMP File History for the block backups:

  1. In the task pane, click Backup Modes: Block.

  2. In the task pane, click Job Tasks: Define Block Backup Wizard to create a new block backup-archive job or edit an existing job.

  3. In the Block Backup Wizard dialog, proceed to the Job Options page and open the NDMP tab.

  4. Set the NDMP File History Handling to either Process History on Local Client or Process History on Master Server.

  5. Complete the Block Backup Wizard.

Tip. You cannot use the NDMP File History and data archiving if the source volume for the original data is using deduplication.

Recovering files or directories from Block Archives

  • When restoring files or directories from Block Backups, Catalogic DPX determines whether the snapshot still exists on the Catalogic vStor storage appliance and, if it does, uses that snapshot to restore the data.

  • If the snapshot has already expired on the vStor while the archive is still retained under its longer retention settings, DPX automatically falls back to the archived data in the cloud or on tape and performs a single-step restore.

Recovering VMDK disks from Agentless Archives

Review the following items before recovering VMDK disks from agentless archives:

  • Restore jobs can fail when the original Catalogic vStor is selected as the destination and the Agentless Backup job has not expired. To avoid this issue, select a different Catalogic vStor as the destination.

  • Create a separate volume on the vStor for recoveries, because the restore job fails if the destination volume you select already contains copies of the files.

Recovery is a two-step process: first, restore the backup instance from the media to a vStor volume, then restore the VMDKs from this volume to complete the recovery.
