Here's the storage section from my bacula-dir.conf:
Storage {
  Name = File
  # Do not use "localhost" here
  Address = ops.jokefire.com # N.B. Use a fully qualified name here
  SDPort = 9103
  Password = "secret"
  Device = FileStorage
  Media Type = File
  TLS Certificate = /etc/pki/tls/certs/ops.jokefire.com.crt
  TLS Key = /etc/pki/tls/private/ops.jokefire.com.key
  TLS CA Certificate File = /etc/pki/CA/certs/ca.crt
  TLS Enable = yes
  TLS Require = yes
}
Here's status client:
[root@ops:/etc/bacula] #bconsole
Connecting to Director ops.jokefire.com:9101
1000 OK: ops.jokefire.com Version: 5.2.13 (19 February 2013)
Enter a period to cancel a command.
*st client
Automatically selected Client: ops.jokefire.com
Connecting to Client ops.jokefire.com at ops.jokefire.com:9102
ops.jokefire.com Version: 5.2.13 (19 February 2013) x86_64-unknown-linux-gnu redhat
Daemon started 07-Dec-13 12:56. Jobs: run=0 running=0.
 Heap: heap=1,024,000 smbytes=198,355 max_bytes=210,031 bufs=152 max_bufs=166
 Sizeof: boffset_t=8 size_t=8 debug=0 trace=0
Running Jobs:
JobId 5 Job ops.jokefire.com.2013-12-07_12.57.06_03 is running.
    Full Backup Job started: 07-Dec-13 12:57
    Files=7,588 Bytes=700,237,644 Bytes/sec=1,075,633 Errors=0
    Files Examined=7,613
    Processing file: /var/account/pacct.3.gz
    SDReadSeqNo=5 fd=5
Director connected at: 07-Dec-13 13:08
====
Terminated Jobs:
JobId  Level  Files    Bytes    Status  Finished         Name
======================================================================
   86  Full   249,198  6.704 G  OK      01-Dec-13 03:20  ops.jokefire.com
   90  Full         1  377.4 M  OK      01-Dec-13 09:33  Jokefire_BackupCatalog
   93  Incr         0        0  Error   02-Dec-13 12:54  ops.jokefire.com
    1  Full   249,265  6.711 G  OK      03-Dec-13 03:22  ops.jokefire.com
    5  Full         1  170.1 M  OK      03-Dec-13 15:45  Jokefire_BackupCatalog
    6  Incr    18,175  847.8 M  OK      04-Dec-13 03:32  ops.jokefire.com
   10  Full         1  197.0 M  OK      04-Dec-13 05:48  Jokefire_BackupCatalog
   11  Incr     1,127  728.2 M  OK      05-Dec-13 03:08  ops.jokefire.com
   15  Full         1  215.5 M  OK      05-Dec-13 03:47  Jokefire_BackupCatalog
   19              12  1.497 K  OK      06-Dec-13 21:55  RestoreFiles
====
And here's status storage:
*st storage
Automatically selected Storage: File
Connecting to Storage daemon File at ops.jokefire.com:9103
ops.jokefire.com Version: 5.2.13 (19 February 2013) x86_64-unknown-linux-gnu redhat
Daemon started 07-Dec-13 12:56. Jobs: run=0, running=0.
 Heap: heap=528,384 smbytes=176,909 max_bytes=177,103 bufs=120 max_bufs=122
 Sizes: boffset_t=8 size_t=8 int32_t=4 int64_t=8 mode=0,0
Running Jobs:
Writing: Full Backup job ops.jokefire.com JobId=5 Volume="jf-backup-tape-0001"
    pool="Default" device="FileStorage" (/backup/tapes)
    spooling=0 despooling=0 despool_wait=0
    Files=9,260 Bytes=737,070,299 AveBytes/sec=630,071 LastBytes/sec=291,475
    FDReadSeqNo=91,891 in_msg=64937 out_msg=5 fd=5
====
Jobs waiting to reserve a drive:
====
Terminated Jobs:
JobId  Level  Files    Bytes    Status  Finished         Name
===================================================================
    7  Incr   141,763  2.790 G  OK      04-Dec-13 04:26  beta.jokefire.com
    8  Incr    62,690  1.250 G  OK      04-Dec-13 05:04  chef.jokefire.com
    9  Incr       823  558.5 M  OK      04-Dec-13 05:26  logs.jokefire.com
   10  Full         1  197.0 M  OK      04-Dec-13 05:48  Jokefire_BackupCatalog
   11  Incr     1,127  728.3 M  OK      05-Dec-13 03:08  ops.jokefire.com
   12  Incr   149,766  2.811 G  OK      05-Dec-13 03:36  beta.jokefire.com
   13  Incr       515  267.5 M  OK      05-Dec-13 03:38  chef.jokefire.com
   14  Incr     1,070  903.1 M  OK      05-Dec-13 03:44  logs.jokefire.com
   15  Full         1  215.5 M  OK      05-Dec-13 03:47  Jokefire_BackupCatalog
   19               0        0  OK      06-Dec-13 21:55  RestoreFiles
====
Device status:
Device "FileStorage" (/backup/tapes) is mounted with:
    Volume: jf-backup-tape-0001
    Pool: Default
    Media type: File
    Total Bytes=737,888,459 Blocks=11,438 Bytes/block=64,512
    Positioned at File=0 Block=737,888,458
====
Used Volume status:
jf-backup-tape-0001 on device "FileStorage" (/backup/tapes)
    Reader=0 writers=1 reserves=0 volinuse=1
====
And now the exciting part!
My first successful SSL backup!
    5  Full   307,963  7.177 G  OK      07-Dec-13 16:20  ops.jokefire.com
And my first successful restore:
  Build OS:               x86_64-unknown-linux-gnu redhat
  JobId:                  6
  Job:                    RestoreFiles.2013-12-07_16.36.37_43
  Restore Client:         ops.jokefire.com
  Start time:             07-Dec-2013 16:36:39
  End time:               07-Dec-2013 16:36:53
  Files Expected:         1
  Files Restored:         1
  Bytes Restored:         504
  Rate:                   0.0 KB/s
  FD Errors:              0
  FD termination status:  OK
  SD termination status:  OK
  Termination:            Restore OK
This is the file that I restored:
-rw-rw-rw- 1 root root 504 Dec  2 22:52 /backup/tapes/bacula-restores/etc/fstab
Now only to add the remote clients!
I'm taking a break before I get to this step. But I thank you all and Ana in particular for all the hard work and advice that got me to this stage so far.
I am optimistic that I can get TLS working with the remote clients as well. But I hope you won't hold it against me if I ping the list again should I run into any issues with that.
Have you configured the storage daemon with TLS? In bacula-dir.conf, you also need to configure the Storage resource with TLS in the same way you did for the File Daemon:
Storage {
  Name = File
  # Do not use "localhost" here
  Address = ops.jokefire.com # N.B. Use a fully qualified name here
  SDPort = 9103
  Password = "secret"
  Device = FileStorage
  Media Type = File
I have some progress to report. Last night I was able to follow the steps that were provided by Ana to recreate the certs. That got me as far as logging into bconsole:
[root@ops:~] #bconsole
Connecting to Director ops.jokefire.com:9101
1000 OK: ops.jokefire.com Version: 5.2.13 (19 February 2013)
Enter a period to cancel a command.
*
And I can connect to the client:
*st client
Automatically selected Client: ops.jokefire.com
Connecting to Client ops.jokefire.com at ops.jokefire.com:9102
ops.jokefire.com Version: 5.2.13 (19 February 2013) x86_64-unknown-linux-gnu redhat
Daemon started 06-Dec-13 22:12. Jobs: run=0 running=0.
 Heap: heap=262,144 smbytes=26,654 max_bytes=26,801 bufs=72 max_bufs=73
 Sizeof: boffset_t=8 size_t=8 debug=0 trace=0
Running Jobs:
Director connected at: 06-Dec-13 22:16
No Jobs running.
====
Terminated Jobs:
JobId  Level  Files    Bytes    Status  Finished         Name
======================================================================
   86  Full   249,198  6.704 G  OK      01-Dec-13 03:20  ops.jokefire.com
   90  Full         1  377.4 M  OK      01-Dec-13 09:33  Jokefire_BackupCatalog
   93  Incr         0        0  Error   02-Dec-13 12:54  ops.jokefire.com
    1  Full   249,265  6.711 G  OK      03-Dec-13 03:22  ops.jokefire.com
    5  Full         1  170.1 M  OK      03-Dec-13 15:45  Jokefire_BackupCatalog
    6  Incr    18,175  847.8 M  OK      04-Dec-13 03:32  ops.jokefire.com
   10  Full         1  197.0 M  OK      04-Dec-13 05:48  Jokefire_BackupCatalog
   11  Incr     1,127  728.2 M  OK      05-Dec-13 03:08  ops.jokefire.com
   15  Full         1  215.5 M  OK      05-Dec-13 03:47  Jokefire_BackupCatalog
   19              12  1.497 K  OK      06-Dec-13 21:55  RestoreFiles
====
It does seem at this point, however, that my celebration was a bit premature.
What I've done is scale down my normal backups to just the localhost on which bacula is running. Once I am able to take a full backup and perform a restore I will consider it a success. I should not have run a victory lap short of achieving this.
Because the next backup I tried to run produced this result:
So dear friends, I was hoping to run my configs by you one more time (hopefully the last) in an attempt to troubleshoot this problem.
These are my cert files:
-r-------- 1 bacula bacula 2.2K Dec  5 21:20 /etc/pki/CA/certs/ca.crt
-r-------- 1 bacula bacula 1.9K Dec  5 21:20 /etc/pki/tls/certs/ops.jokefire.com.crt
-r-------- 1 bacula bacula 3.2K Dec  5 21:20 /etc/pki/tls/private/ops.jokefire.com.key
This is the state my configs were in during my last attempt. I have not yet reverted to the working configs.
bacula-dir.conf
Director { # define myself
  Name = ops.jokefire.com
  DIRport = 9101 # where we listen for UA connections
  QueryFile = "/etc/bacula/query.sql"
  WorkingDirectory = "/var/spool/bacula"
  PidDirectory = "/var/run"
  Maximum Concurrent Jobs = 1
  Password = "secret" # Console password
  Messages = Daemon
  TLS Certificate = /etc/pki/tls/certs/ops.jokefire.com.crt
  TLS Key = /etc/pki/tls/private/ops.jokefire.com.key
  TLS CA Certificate File = /etc/pki/CA/certs/ca.crt
  TLS Enable = yes
  TLS Require = yes
}
# Client (File Services) to backup
Client {
  Name = ops.jokefire.com
  Address = ops.jokefire.com
  FDPort = 9102
  Catalog = JokefireCatalog
  Password = "secret" # password for FileDaemon
  File Retention = 14 days # 14 days
  Job Retention = 14d # 14 days
  AutoPrune = yes # Prune expired Jobs/Files
  TLS Certificate = /etc/pki/tls/certs/ops.jokefire.com.crt
  TLS Key = /etc/pki/tls/private/ops.jokefire.com.key
  TLS CA Certificate File = /etc/pki/CA/certs/ca.crt
  TLS Enable = yes
  TLS Require = yes
}
Storage {
  Name = File
  # Do not use "localhost" here
  Address = ops.jokefire.com # N.B. Use a fully qualified name here
  SDPort = 9103
  Password = "secret"
  Device = FileStorage
  Media Type = File
}
bacula-fd.conf
Director {
  Name = ops.jokefire.com
  Password = "secret"
  TLS Certificate = /etc/pki/tls/certs/ops.jokefire.com.crt
  TLS Key = /etc/pki/tls/private/ops.jokefire.com.key
  TLS CA Certificate File = /etc/pki/CA/certs/ca.crt
  TLS Enable = yes
  TLS Require = yes
}
FileDaemon { # this is me
  Name = ops.jokefire.com
  FDport = 9102 # where we listen for the director
  WorkingDirectory = /var/bacula
  Pid Directory = /var/run
  Maximum Concurrent Jobs = 20
  TLS Certificate = /etc/pki/tls/certs/ops.jokefire.com.crt
  TLS Key = /etc/pki/tls/private/ops.jokefire.com.key
  TLS CA Certificate File = /etc/pki/CA/certs/ca.crt
  TLS Enable = yes
  TLS Require = yes
}
bacula-sd.conf
Storage { # definition of myself
  Name = ops.jokefire.com
  SDPort = 9103 # Director's port
  WorkingDirectory = "/var/spool/bacula"
  Pid Directory = "/var/run"
  Maximum Concurrent Jobs = 20
  TLS Certificate = /etc/pki/tls/certs/ops.jokefire.com.crt
  TLS Key = /etc/pki/tls/private/ops.jokefire.com.key
  TLS CA Certificate File = /etc/pki/CA/certs/ca.crt
  TLS Enable = yes
  TLS Require = yes
}
Director {
  Name = ops.jokefire.com
  Password = "secret"
  TLS Certificate = /etc/pki/tls/certs/ops.jokefire.com.crt
  TLS Key = /etc/pki/tls/private/ops.jokefire.com.key
  TLS CA Certificate File = /etc/pki/CA/certs/ca.crt
  TLS Enable = yes
  TLS Require = yes
  #Monitor = yes
}
I'm going to put here everything that I did, and maybe you will find an answer to your problem. The whole thing is about certificates. I think the two major points are: removing the passphrase from the director1 private key, and having a CA sign the director1 certificate (don't use a self-signed certificate).
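For reference, those two points (a private key without a passphrase, and a certificate signed by a separate CA) might look roughly like this with the openssl command line. This is only a sketch: it works in a throwaway directory, and the file names and subject fields are placeholders rather than anything taken from this thread.

```shell
#!/bin/sh
# Sketch only: ca.*, server.* and the -subj values are illustrative assumptions.
set -e
workdir=$(mktemp -d)
cd "$workdir"

# 1. CA key and self-signed CA certificate. The CA's CN differs from the server's CN.
openssl req -x509 -newkey rsa:2048 -nodes -days 365 \
    -keyout ca.key -out ca.crt \
    -subj "/C=US/O=Example/CN=Example Bacula CA" 2>/dev/null

# 2. Server key written without a passphrase (-nodes), plus a signing request.
openssl req -newkey rsa:2048 -nodes \
    -keyout server.key -out server.csr \
    -subj "/C=US/O=Example/CN=backup.example.com" 2>/dev/null

# 3. The CA signs the request, so the server certificate is not self-signed.
openssl x509 -req -in server.csr -CA ca.crt -CAkey ca.key \
    -CAcreateserial -days 365 -out server.crt 2>/dev/null

# 4. The server certificate should verify against the CA certificate.
openssl verify -CAfile ca.crt server.crt
```

If step 4 does not report OK for the real files, the daemons are unlikely to complete a TLS handshake with them either.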
Subject: C=US, ST=NJ, L=Newark, O=Jokefire LLC, OU=Ops, CN=ops.jokefire.com CA
I just added the CA after the common name for the CA cert in order to prevent a naming collision as per this advice that I found in that other article I followed in how to name the CN.
"The Common Name (CN) of the CA and the Server certificates must NOT match or else a naming collision will occur and you'll get errors later on. In this step, you'll provide the CA entries. In a step below, you'll provide the Server entries. In this example, I just added "CA" to the CA's CN field, to distinguish it from the Server's CN field. Use whatever schema you want, just make sure the CA and Server entries are not identical."
And I renamed all references to director in all the configs to refer instead to 'ops.jokefire.com'.
Thanks for pointing me to that thread. The guy's problem was very similar to my own. But ultimately no such luck after following the advice there, sad to say.
The part that I keyed onto was where he said this:
Thank you. After a while I figured out how to do this. Furthermore I
had "nsCertType = server" in my caconfig.cnf and commented it. Now I
see...
That was on an Ubuntu machine. I'm on a CentOS 5.9 host and on my setup the file was openssl.cnf. I set the recommended settings there and regenerated the keys.
[root@storage:/etc/bacula] #grep -i nscerttype /etc/openvpn/easy-rsa/1.0/openssl.cnf
# Here are some examples of the usage of nsCertType. If it is omitted
# nsCertType = server
# nsCertType = objsign
# nsCertType = client, email
# nsCertType = client, email, objsign
# JY ADDED -- Make a cert with nsCertType set to "server"
nsCertType = server
# nsCertType = sslCA, emailCA
Here are the certs I've created for this go-around (and unfortunately I feel like I'm spinning in circles):
## CA Cert / Key
-r-------- 1 root root 2216 Nov 29 18:08 /etc/pki/CA/certs/ca.crt
-r-------- 1 root root 3243 Nov 29 18:08 /etc/pki/CA/private/ca.key
## Server Cert /Key
-r-------- 1 root root 1903 Nov 29 18:23 /etc/pki/tls/certs/ops.jokefire.com.crt
-r-------- 1 root root 3243 Nov 29 18:23 /etc/pki/tls/private/ops.jokefire.com.key
The guide that I used to create the keys for this attempt can be found here:
The Common Name (CN) of the CA and the Server certificates must NOT match or else a naming collision will occur and you'll get errors later on. In this step, you'll provide the CA entries. In a step below, you'll provide the Server entries. In this example, I just added "CA" to the CA's CN field, to distinguish it from the Server's CN field. Use whatever schema you want, just make sure the CA and Server entries are not identical.
So I created the certs with differing hostnames for the CN section in the root CA cert and the server certificate:
Both of which are in the hosts file and pointing to the internal IP of the EC2 instance.
And here was the config for this attempt:
bacula-dir.conf
## Bacula Dir config
Director { # define myself
  Name = storage.jokefire.com
  DIRport = 9101 # where we listen for UA connections
  QueryFile = "/etc/bacula/query.sql"
  WorkingDirectory = "/var/spool/bacula"
  PidDirectory = "/var/run"
  Maximum Concurrent Jobs = 1
  Password = "secret" # Console password
  Messages =
  TLS Certificate = /etc/pki/tls/certs/ops.jokefire.com.crt
  TLS Key = /etc/pki/tls/private/ops.jokefire.com.key
  TLS CA Certificate File = /etc/pki/CA/certs/ca.crt
  TLS Enable = yes
  TLS Require = yes
}
FileDaemon { # this is me
  Name = storage.jokefire.com
  FDport = 9102 # where we listen for the director
  WorkingDirectory = /var/bacula
  Pid Directory = /var/run
  Maximum Concurrent Jobs = 20
[root@storage:/etc/bacula] #bconsole
Connecting to Director storage.jokefire.com:9101
TLS negotiation failed
Director authorization problem.
Most likely the passwords do not agree.
If you are using TLS, there may have been a certificate validation error during the TLS handshake.
Please see http://www.bacula.org/en/rel-manual/Bacula_Freque_Asked_Questi.html#SECTION00260000000000000000 for help.
Less verbose error than last time! So I feel that I may be getting closer. :)
Nothing turns up in the bacula log for some reason when I make the attempt. Oh well.
Next I tried commenting out the TLS options on just the FD and SD to see if I could get the DIR and Console to communicate via TLS.
Same EXACT outcome.
[root@storage:/etc/bacula] #bconsole
Connecting to Director storage.jokefire.com:9101
TLS negotiation failed
Director authorization problem.
Most likely the passwords do not agree.
If you are using TLS, there may have been a certificate validation error during the TLS handshake.
Please see http://www.bacula.org/en/rel-manual/Bacula_Freque_Asked_Questi.html#SECTION00260000000000000000 for help.
You guys have been great in responding and very patient. I hope this problem isn't wearing as thin on your nerves at this point as it is on mine! lol
You are having a problem with TLS communication between bconsole and the director.
I suggest you remove all the other TLS configuration (client, storage) and try to resolve this one first. When I tried this configuration, I remember doing it in that order: TLS between director and bconsole, then TLS between director and client, and so on.
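A sketch of what the "bconsole and Director only" step could look like on the console side, assuming the TLS directive names used in the daemon configs are also accepted in bconsole.conf. The names and paths are borrowed from the configs posted in this thread and may not match the real files.

```
# bconsole.conf (sketch; names and paths are assumptions)
Director {
  Name = storage.jokefire.com
  DIRport = 9101
  address = storage.jokefire.com
  Password = "secret"
  TLS Enable = yes
  TLS Require = yes
  # Should be the CA that signed the certificate the Director presents
  TLS CA Certificate File = /etc/pki/CA/certs/ca.crt
}
```

If the Director is set to verify peers, the console would presumably also need its own TLS Certificate and TLS Key signed by the same CA.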
Ok, well I took your advice and commented out the TLS configuration in the client section of bacula-dir, and commented it out entirely in the bacula-sd and bacula-fd configuration files. After bouncing the services again and going into bconsole I get the same error:
[root@storage:/etc/bacula] #bconsole
Connecting to Director storage.jokefire.com:9101
29-Nov 15:06 bconsole JobId 0: Error: tls.c:92 Error with certificate at depth: 0, issuer =/C=US/ST=NJ/L=Newark/O=Jokefire LLC/OU=Ops/CN=storage.jokefire.com, subject =/C=US/ST=NJ/L=Newark/O=Jokefire LLC/OU=Ops/CN=storage.jokefire.com, ERR=18:self signed certificate
TLS negotiation failed
Director authorization problem.
Most likely the passwords do not agree.
If you are using TLS, there may have been a certificate validation error during the TLS handshake.
Please see http://www.bacula.org/en/rel-manual/Bacula_Freque_Asked_Questi.html#SECTION00260000000000000000 for help.
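The ERR=18 in the output above is OpenSSL's verify code for a self-signed certificate at depth 0: the certificate presented was signed by its own key and is not among the verifier's trusted CAs. The sketch below reproduces the same condition with throwaway files so it can be compared against a real setup. All file names and subjects are made up for illustration.

```shell
#!/bin/sh
# Sketch: reproduce verify error 18 with throwaway files (names are illustrative).
set -e
tmp=$(mktemp -d)
cd "$tmp"

# A CA certificate that the verifier trusts.
openssl req -x509 -newkey rsa:2048 -nodes -days 30 \
    -keyout ca.key -out ca.crt -subj "/CN=Example Test CA" 2>/dev/null

# A self-signed "server" certificate that the CA above never signed.
openssl req -x509 -newkey rsa:2048 -nodes -days 30 \
    -keyout selfsigned.key -out selfsigned.crt \
    -subj "/CN=backup.example.com" 2>/dev/null

# On a self-signed certificate, issuer and subject are identical.
issuer=$(openssl x509 -in selfsigned.crt -noout -issuer | sed 's/^issuer= *//')
subject=$(openssl x509 -in selfsigned.crt -noout -subject | sed 's/^subject= *//')
[ "$issuer" = "$subject" ] && echo "self-signed: issuer == subject"

# Verifying it against the unrelated CA fails with error 18.
result=$(openssl verify -CAfile ca.crt selfsigned.crt 2>&1 || true)
echo "$result"
```

Running the same two `openssl x509` and `openssl verify` commands against the real certificate and CA file should show whether the server certificate was actually signed by the CA that bconsole is told to trust.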
I don't know if this could be an issue, but your certificate has an issuer OU that is different from the subject OU:
I'm actually not obscuring the rest of the cert data this time around. So you can see that the apparent disparity to which you refer was actually a mistake on my part in obscuring the data. However I don't see anything too threatening in revealing the info here.
Looks like it agrees to me! So there shouldn't be a disparity of this nature causing the error I assume.
And in your bacula-sd.conf, also remove or set it to no: "TLS Verify Peer = yes".
I did try a bounce with this change in place, and it made no difference here either. I got the same exact error.
I do not know which Bacula version you are running, but in the bconsole configuration file, I have the address value pointing to the "director's machine name":
I do not know how to check the bacula version other than that of bconsole which is:
Version:5.2.13(19 February 2013) x86_64-unknown-linux-gnu redhat
And I don't see any disparity between the director listed in the bacula-dir file and in the bconsole file:
bacula-dir.conf
Director { # define myself
  Name = storage.jokefire.com
bconsole.conf
Director {
  Name = storage.jokefire.com
Really, I do not see any other problem.
Interesting to know!
Have you checked the firewall?
Well, on my first attempt I am merely trying to back up only the localhost. I know that there are two different names listed here (storage.jokefire.com and ops.jokefire.com) but these are merely two different DNS names for the same host. So the firewall shouldn't come into play here. Plus, this is an EC2 host and I manage the firewall with AWS Security Groups and leave iptables turned off.
But I wonder if that could also be another problem? Though I don't see it being part of the problem I'm having with getting bacula to agree with its own TLS configuration.
I really hope that the problem we're having here isn't centered around my using self-signed certs. I'd hate to shell out for a commercial one, especially as I consider the commercial cert business to be sort of a scam.
You are having a problem with TLS communication between bconsole and the director.
I suggest you remove all the other TLS configuration (client, storage) and try to resolve this one first. When I tried this configuration, I remember doing it in that order: TLS between director and bconsole, then TLS between director and client, and so on.
I don't know if this could be an issue, but your certificate has an issuer OU that is different from the subject OU:
And the permissions on the cert files appear to be correct:
-rw-r--r-- 1 bacula bacula 1521 Nov 28 13:53 /etc/pki/CA/certs/rootBaculaCA.pem
-rw-r--r-- 1 bacula bacula 1224 Nov 28 13:54 /etc/pki/tls/certs/storage.jokefire.com.crt
-rw-r--r-- 1 bacula bacula 1675 Nov 28 13:54 /etc/pki/tls/private/storage.jokefire.com.key
And the services bounce without any complaint:
[root@storage:~] #bounce-bacula
Stopping Bacula Storage services: [ OK ]
Starting Bacula Storage services: [ OK ]
Stopping Bacula File services: [ OK ]
Starting Bacula File services: [ OK ]
Stopping Bacula Director services: [ OK ]
Starting Bacula Director services: [ OK ]
Yet the same error as before is produced:
[root@storage:~] #bconsole
Connecting to Director storage.jokefire.com:9101
29-Nov 13:08 bconsole JobId 0: Error: tls.c:92 Error with certificate at depth: 0, issuer =/C=US/ST=XX/L=XX/O=XX/OU=XXX/CN=storage.jokefire.com, subject =/C=US/ST=XX/L=XX/O=XX/OU=XX/CN=storage.jokefire.com, ERR=18:self signed certificate
TLS negotiation failed
Director authorization problem.
Most likely the passwords do not agree.
If you are using TLS, there may have been a certificate validation error during the TLS handshake.
Please see http://www.bacula.org/en/rel-manual/Bacula_Freque_Asked_Questi.html#SECTION00260000000000000000 for help.
And I see that the subject line from the cert agrees with the error that I'm seeing in Bacula.
Looking forward to coming to some sort of resolution with this, it's been days and days that I've been working on it. And I certainly appreciate everyone's help and input.
Verify peer certificate. Instructs server to request and verify the client's x509 certificate. Any client certificate signed by a known-CA will be accepted unless the TLS Allowed CN configuration directive is used, in which case the client certificate must correspond to the Allowed Common Name specified. This directive is valid only for a server and not in a client context.
bacula-sd.conf
Storage { # definition of myself
...
# Peer certificate is not required/requested -- peer validity
# is verified by the storage connection cookie provided to the
# File Daemon by the director.
TLS Verify Peer = no
...
}
A while ago I configured a test environment with TLS, and I remember using "TLS Verify Peer = no" because of the self-signed certificates.
I think you can use "TLS Verify Peer = yes" combined with:
TLS Allowed CN = <string list>
Common name attribute of allowed peer certificates. If this directive is specified, all server certificates will be verified against this list. This can be used to ensure that only the CA-approved Director may connect. This directive may be specified more than once.
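As a rough illustration of the combination described above, a Director resource in bacula-sd.conf might look like the fragment below. The Allowed CN value and the paths are assumptions pieced together from the files listed in this thread, and the Director would need to present a certificate whose CN matches the listed name.

```
# bacula-sd.conf (sketch; values are assumptions, not a tested config)
Director {
  Name = storage.jokefire.com
  Password = "secret"
  TLS Enable = yes
  TLS Require = yes
  TLS Verify Peer = yes
  # Only a peer certificate carrying this CN is accepted
  TLS Allowed CN = "storage.jokefire.com"
  TLS CA Certificate File = /etc/pki/CA/certs/rootBaculaCA.pem
  TLS Certificate = /etc/pki/tls/certs/storage.jokefire.com.crt
  TLS Key = /etc/pki/tls/private/storage.jokefire.com.key
}
```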
However when I go into bconsole this is what I find:
[root@storage:~/bacula-certs-new] #bconsole
Connecting to Director storage.jokefire.com:9101
28-Nov 14:04 bconsole JobId 0: Error: tls.c:92 Error with certificate at depth: 0, issuer = /C=US/ST=XX/L=XX/O=XX/OU=XX/CN=storage.jokefire.com, subject = /C=US/ST=XX/L=XX/O=XX/OU=XX/CN=storage.jokefire.com, ERR=18:self signed certificate
TLS negotiation failed
Director authorization problem.
Most likely the passwords do not agree.
If you are using TLS, there may have been a certificate validation error during the TLS handshake.
Please see http://www.bacula.org/en/rel-manual/Bacula_Freque_Asked_Questi.html#SECTION00260000000000000000 for help.
Passwords have not been changed from the working configs, which have been in place and working for several months now.
Any further thoughts?
Many thanks and I hope you are enjoying your holiday!
I've saved my work with TLS so I'm eager to get this going. I used the following guide to generate the certs, and I'm wondering if the problem could be in the way I generated them?
--
####################################
Iban Cabrillo Bartolome
Instituto de Fisica de Cantabria (IFCA)
Santander, Spain
Tel: +34942200969
####################################
Bertrand Russell: "The problem with the world is that the stupid are sure of everything and the intelligent are full of doubts"