bacula-sd Mutex lock failure.
Summary
Reproducibility | Platform | OS | OS Version | Product Version |
---|---|---|---|---|
always | AMD64 | NetBSD | 10.0_BETA | bacula 13.0.1 |
Description
On the latest version of NetBSD, pthread_mutex_lock checks that pthread_mutex_init has been called on the pthread_mutex_t structure. If it has not, EINVAL is returned. A Bacula job that uses a spool file fails when it commits the spooled data to a volume: the bacula-sd daemon aborts with a segmentation violation.
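The initialize-before-lock rule described above can be illustrated with a minimal sketch. `DemoSpoolDev` and both helper functions are hypothetical names invented for this example, not Bacula code; only the `pthread_*` calls and `strerror` are real APIs. The sketch never locks an uninitialized mutex, because POSIX leaves that undefined and only some platforms (NetBSD, per this report) answer with EINVAL.

```cpp
#include <pthread.h>
#include <cstdio>
#include <cstring>

// Hypothetical stand-in for a device object that owns a mutex.
// This is NOT Bacula's DEVICE class.
struct DemoSpoolDev {
    pthread_mutex_t dcrs_mutex;
};

// Initialize the mutex before any lock attempt.
// Returns 0 on success, otherwise the pthread error code.
int demo_dev_init(DemoSpoolDev *dev) {
    return pthread_mutex_init(&dev->dcrs_mutex, nullptr);
}

// Lock and unlock once, reporting any failure in a form similar to
// the "ERR=..." text seen in the job log.
int demo_dev_lock_cycle(DemoSpoolDev *dev) {
    int rc = pthread_mutex_lock(&dev->dcrs_mutex);
    if (rc != 0) {
        std::fprintf(stderr, "Mutex lock failure. ERR=%s\n", std::strerror(rc));
        return rc;
    }
    return pthread_mutex_unlock(&dev->dcrs_mutex);
}

// Release the mutex resources once the object is no longer needed.
int demo_dev_destroy(DemoSpoolDev *dev) {
    return pthread_mutex_destroy(&dev->dcrs_mutex);
}
```

Calling `demo_dev_init` first, then `demo_dev_lock_cycle`, should return 0 on any conforming platform. Skipping the init step is the situation this report describes.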
Spooling data ...
backup-sd Committing spooled data to Volume "backup-vol0975". Despooling 42,448,204,904 bytes ...
ABORTING via segfault due to ERROR in lockmgr.c:111 Mutex lock failure. ERR=Invalid argument
backup.ecs.vuw.ac.nz-dir
Debugging shows
backup-sd: spool.c:132-145318 Committing spooled data
backup-sd: spool.c:194-145318 Despooling data
backup-sd: block_util.c:174-145318 New block adata=0 len=64512 block=7186c3870900
backup-sd: acquire.c:671-145318 Attach 0xc38b5628 to dev None
backup-sd: ABORTING via segfault due to ERROR in lockmgr.c:111 Mutex lock failure. ERR=Invalid argument
15-Mar 14:30 backup.ecs.vuw.ac.nz-sd: ABORTING via segfault due to ERROR in lockmgr.c:112 Mutex lock failure. ERR=Invalid argument
Bacula interrupted by signal 11: Segmentation violation
Kaboom! bacula-sd-noopt, backup.ecs.vuw.ac.nz-sd got signal 11 - Segmentation violation at 15-Mar-2023 14:30:23. Attempting traceback.
In bacula-13.0.1/src/stored/spool.c (under the nice comment), a new file_dev is created, but no initialization is ever called on it. In bacula-13.0.1/src/stored/init_dev.c, device_generic_init is called to initialize everything, including the mutexes.
spool.c, starting at line 226:
   /*
    * This is really quite kludgy and should be fixed some time.
    * We create a dev structure to read from the spool file
    * in rdev and rdcr.
    */
   rdev = New(file_dev);
A fix that works, though it may not be the best one, is to call pthread_mutex_init on dcrs_mutex (the mutex that the failing lock call targets) in spool.c. The patch file I have attached, bacula-sd-spool.c-patch, does this with some code copied from the DEVICE::device_generic_init function in init_dev.c, and it works.
A better solution would probably be to modify void DEVICE::device_generic_init(JCR *jcr, DEVRES *device) in init_dev.c so that it can be called with a null *device. spool.c could then call it on the new file_dev, and everything else would be correctly initialized as well.
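The null-tolerant initializer suggested above might look roughly like the following sketch. Everything here is an assumption made for illustration: `SketchDevRes`, `SketchDevice`, `sketch_generic_init`, and the "(unnamed)" placeholder are hypothetical and do not reflect the real DEVRES or DEVICE definitions or the actual body of device_generic_init.

```cpp
#include <pthread.h>

// Hypothetical stand-in for a device resource (NOT Bacula's DEVRES).
struct SketchDevRes {
    const char *device_name;
};

// Hypothetical stand-in for a device object (NOT Bacula's DEVICE).
struct SketchDevice {
    pthread_mutex_t dcrs_mutex;
    const char *name;
};

// Always initialize the mutex; copy resource settings only when a
// resource is supplied. A temporary device created for reading a
// spool file could pass nullptr for res.
// Returns 0 on success, otherwise the pthread error code.
int sketch_generic_init(SketchDevice *dev, const SketchDevRes *res) {
    int rc = pthread_mutex_init(&dev->dcrs_mutex, nullptr);
    if (rc != 0) {
        return rc;
    }
    // Placeholder name is an assumption of this sketch.
    dev->name = (res != nullptr) ? res->device_name : "(unnamed)";
    return 0;
}
```

The point of the design is that mutex setup no longer depends on having a resource, so a caller such as the spool reader could not skip it by accident.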