
Cloud storage not creating archive volume sub-directory when the job contains no files and is the first job on the volume (intermittent)

I have some jobs that back up to a MinIO cloud storage device. The target filesets of these jobs change infrequently, so differentials and incrementals regularly have no files to back up. The job fails at the labelling stage because the volume's archive sub-directory (e.g. /opt/bacula/backups/minio/Obj-10373) does not exist, and this hangs the storage. Over a period of about 6 months it happened about once a month, but in the last week it happened every other day.

Something like this appears in the logs:

27-Oct 10:35 bacula-sd JobId 57260: Warning: label.c:409 Open Cloud device "MinioChanger-Dev1" (/opt/bacula/backups/minio) Volume "Obj-10373" failed: ERR=unknown error
27-Oct 10:35 bacula-sd JobId 57260: Warning: mount.c:216 Open of Cloud device "MinioChanger-Dev1" (/opt/bacula/backups/minio) Volume "Obj-10373" failed: ERR=unknown error
27-Oct 10:35 bacula-sd JobId 57260: Marking Volume "Obj-10373" in Error in Catalog.
*list  jobid=57260
+--------+--------------------+---------------------+------+-------+----------+----------+-----------+
| JobId  | Name               | StartTime           | Type | Level | JobFiles | JobBytes | JobStatus |
+--------+--------------------+---------------------+------+-------+----------+----------+-----------+
| 57,260 | bhs_apache_configs | 2025-10-27 10:35:13 | B    | D     |        0 |        0 | f         |
+--------+--------------------+---------------------+------+-------+----------+----------+-----------+
*

The volume is in Error state:

*list volume=Obj-10373
+---------+------------+-----------+---------+----------+----------+--------------+---------+------+-----------+----------------------+---------+----------+-----------+
| MediaId | VolumeName | VolStatus | Enabled | VolBytes | VolFiles | VolRetention | Recycle | Slot | InChanger | MediaType            | VolType | VolParts | ExpiresIn |
+---------+------------+-----------+---------+----------+----------+--------------+---------+------+-----------+----------------------+---------+----------+-----------+
|  10,373 | Obj-10373  | Error     |       1 |        0 |        0 |   31,536,000 |       0 |    0 |         0 | MinioObjectMediaType |      14 |        0 | NULL      |
+---------+------------+-----------+---------+----------+----------+--------------+---------+------+-----------+----------------------+---------+----------+-----------+

The labelling step that emits the warning is this code in label.c:


--- a/bacula/src/stored/label.c
+++ b/bacula/src/stored/label.c
@@ -405,9 +405,9 @@
    Dmsg1(100, "New VolName=%s\n", VolName);
    if (!dev->open_device(dcr, OPEN_READ_WRITE)) {
       /* If device is not tape, attempt to create it */
       if (dev->is_tape() || !dev->open_device(dcr, CREATE_READ_WRITE)) {
          Jmsg4(dcr->jcr, M_WARNING, 0, _("Open %s device %s Volume \"%s\" failed: ERR=%s"),
                dev->print_type(), dev->print_name(), dcr->VolumeName, dev->bstrerror());
          goto bail_out;
       }
    }

And this is the code in cloud_dev.c that creates the directory during open_device:

   /* If create make directory with Volume name */
   if (part <= 0 && omode == CREATE_READ_WRITE) {
      Dmsg1(dbglvl, "=== makedir=%s\n", archive_name.c_str());
      if (!makedir(dcr->jcr, archive_name.c_str(), 0740)) {
         berrno be;
         if (errmsg[0] == 0) {
            Mmsg2(errmsg, _("Could not make dir %s. %s"), archive_name.c_str(), be.bstrerror());
         }
         Dmsg2(dbglvl, _("Could not make dir %s. %s"), archive_name.c_str(), be.bstrerror());
         Leave(dbglvl);
         return false;
      }
   }

My supposition is that part may not be "<= 0" in some of these cases, in which case the makedir call is skipped and the directory is never created.
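To make that supposition concrete, here is a minimal sketch of the guard on its own. It is not Bacula code: the enum values, would_make_volume_dir and label_open_leaves_dir are hypothetical names for illustration, and the model assumes the first OPEN_READ_WRITE attempt fails because the directory is missing.

```cpp
#include <cassert>

// Hypothetical stand-ins for the open modes seen in the excerpts above.
// The numeric values are illustrative only, not Bacula's.
enum OpenMode { OPEN_READ_WRITE = 1, CREATE_READ_WRITE = 2 };

// Mirrors the guard from the cloud_dev.c excerpt:
//   part <= 0 && omode == CREATE_READ_WRITE
bool would_make_volume_dir(int part, OpenMode omode) {
    return part <= 0 && omode == CREATE_READ_WRITE;
}

// Simplified model of the label.c sequence for a non-tape device whose
// volume directory does not exist yet: try OPEN_READ_WRITE first, then
// fall back to CREATE_READ_WRITE. Returns true if the directory would
// exist afterwards. Assumption: `part` keeps the same value across both calls.
bool label_open_leaves_dir(int part) {
    if (would_make_volume_dir(part, OPEN_READ_WRITE)) {
        return true;  // never taken: a plain open does not create anything
    }
    return would_make_volume_dir(part, CREATE_READ_WRITE);
}

// Small self-check that doubles as a usage example.
void demo_guard() {
    assert(label_open_leaves_dir(0));   // fresh volume: directory is made
    assert(!label_open_leaves_dir(1));  // stale part > 0: makedir is skipped
}
```

If this model is right, a leftover part value greater than 0 from a previous volume would be enough to skip the makedir, so the tracing should log part and omode immediately before the guard.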

As a troubleshooting step I set the max jobs per volume to 1, so that every job creates a new cloud volume. This ran successfully for a few days without any errors on zero-file jobs, so it is not happening every time.

I have added tracing and am waiting for a reproduction with the full trace, but thought I would raise this now to see if I could get some suggestions on reproducing it reliably. Cheers
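While waiting for a reproduction, a small checker along these lines could flag volumes whose sub-directory is missing under the archive directory. This is only a sketch: missing_volume_dirs is a hypothetical helper, and it assumes the volume names are supplied by the caller (for example copied from the catalog listing) and that each volume should have a directory named after it, as with /opt/bacula/backups/minio/Obj-10373.

```cpp
#include <cassert>
#include <filesystem>
#include <string>
#include <system_error>
#include <vector>

namespace fs = std::filesystem;

// Returns the names from `volumes` that have no sub-directory of the same
// name under `archive_root`. Hypothetical helper, not part of Bacula.
std::vector<std::string> missing_volume_dirs(
        const fs::path& archive_root,
        const std::vector<std::string>& volumes) {
    assert(!archive_root.empty());  // caller must pass the archive directory
    std::vector<std::string> missing;
    for (const auto& name : volumes) {
        std::error_code ec;  // non-throwing overload: errors count as "missing"
        if (!fs::is_directory(archive_root / name, ec)) {
            missing.push_back(name);
        }
    }
    return missing;
}
```

Calling missing_volume_dirs("/opt/bacula/backups/minio", {"Obj-10373"}) after each zero-file job would show whether the directory was skipped before the volume is marked in error.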

Edited by Tom Hodder