Checking the 3PAR Quorum Witness appliance
Two 3PAR StoreServ arrays running in a Peer Persistence setup had lost their connection to the Quorum Witness appliance. The appliance is an important part of a 3PAR Peer Persistence setup because it acts as the tie-breaker in a split-brain scenario.
While analyzing this issue, I also saw a message about it in the 3PAR Management Console.
In addition to that, the customer received e-mails reporting that the 3PAR StoreServ arrays had lost the connection to the Quorum Witness appliance. In my case, the CouchDB process on the appliance had died. A restart of the appliance brought it back online.
How to check the Quorum Witness appliance?
You can check the status of the appliance with a simple web request. The documentation shows a simple test based on curl. You can run it directly from a Bash shell on the appliance.
[root@linuxvm ~]# curl http://10.0.0.99:8080
{"couchdb":"Welcome","version":"1.0.4"}
[root@linuxvm ~]#
But you can also use the PowerShell cmdlet Invoke-WebRequest.
PS C:\Users\patrick> Invoke-WebRequest -Uri http://10.0.0.99:8080
StatusCode : 200
StatusDescription : OK
Content : {"couchdb":"Welcome","version":"1.0.4"}
RawContent : HTTP/1.1 200 OK
Content-Length: 40
Cache-Control: must-revalidate
Content-Type: text/plain;charset=utf-8
Date: Mon, 30 Jan 2017 08:31:37 GMT
Server: CouchDB/1.0.4 (Erlang OTP/R14B04)
{"couchdb...
Forms : {}
Headers : {[Content-Length, 40], [Cache-Control, must-revalidate], [Content-Type, text/plain;charset=utf-8],
[Date, Mon, 30 Jan 2017 08:31:37 GMT]...}
Images : {}
InputFields : {}
Links : {}
ParsedHtml : mshtml.HTMLDocumentClass
RawContentLength : 40
If you add /witness to the URL, you can test access to the database that is used for Peer Persistence.
PS C:\Users\patrick> Invoke-WebRequest -Uri http://10.0.0.99:8080/witness
StatusCode : 200
StatusDescription : OK
Content : {"db_name":"witness","doc_count":5,"doc_del_count":4,"update_seq":149557915,"purge_seq":0,"compact_
running":false,"disk_size":48988254,"instance_start_time":"1485763322826940","disk_format_version":
5,...
RawContent : HTTP/1.1 200 OK
Content-Length: 234
Cache-Control: must-revalidate
Content-Type: text/plain;charset=utf-8
Date: Mon, 30 Jan 2017 08:36:38 GMT
Server: CouchDB/1.0.4 (Erlang OTP/R14B04)
{"db_nam...
Forms : {}
Headers : {[Content-Length, 234], [Cache-Control, must-revalidate], [Content-Type,
text/plain;charset=utf-8], [Date, Mon, 30 Jan 2017 08:36:38 GMT]...}
Images : {}
InputFields : {}
Links : {}
ParsedHtml : mshtml.HTMLDocumentClass
RawContentLength : 234
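Both requests can also be wrapped in a small Bash script, for example to run them from a monitoring host. The following is only a sketch: the function names and the 5-second timeout are my own assumptions, and the string matching relies on the responses looking exactly like the output shown above.

```shell
#!/usr/bin/env bash
# Sketch: check the Quorum Witness web endpoints with curl.
# Function names and the 5-second timeout are illustrative assumptions.

# Succeeds if the body looks like the CouchDB welcome response.
is_couchdb_welcome() {
    printf '%s' "$1" | grep -q '"couchdb":"Welcome"'
}

# Succeeds if the body describes the "witness" database.
is_witness_db() {
    printf '%s' "$1" | grep -q '"db_name":"witness"'
}

# Query both URLs of the appliance and report the result.
# $1 = base URL, e.g. http://10.0.0.99:8080
check_quorum_witness() {
    local base="$1" body
    # --silent: no progress meter; --fail: non-zero exit on HTTP errors;
    # --max-time: give up after 5 seconds instead of hanging.
    body=$(curl --silent --fail --max-time 5 "$base") \
        || { echo "CouchDB not reachable"; return 1; }
    is_couchdb_welcome "$body" \
        || { echo "unexpected response from CouchDB"; return 1; }
    body=$(curl --silent --fail --max-time 5 "$base/witness") \
        || { echo "witness database not reachable"; return 1; }
    is_witness_db "$body" \
        || { echo "unexpected response from /witness"; return 1; }
    echo "Quorum Witness OK"
}
```

After sourcing the script, a call such as check_quorum_witness http://10.0.0.99:8080 prints a one-line result and returns a non-zero exit code on failure, which makes it easy to hook into a cron job or a monitoring check.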
If you get a connection error, check whether the beam process (the Erlang VM that runs CouchDB) is listening on port 8080.
[root@linuxvm ~]# netstat -tulpen | grep 8080
tcp 0 0 0.0.0.0:8080 0.0.0.0:* LISTEN 495 10726 1643/beam
[root@linuxvm ~]#
If not, reboot the appliance. This can be done without downtime, because the appliance only comes into play when a failover occurs.
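The port check can be scripted as well. The sketch below assumes the column layout of the netstat -tulpen output shown above; has_listener is a hypothetical helper name. Unlike a plain grep 8080, it only matches a listening TCP socket whose local port is exactly the one requested.

```shell
#!/usr/bin/env bash
# Sketch: detect whether a process listens on a given TCP port.
# has_listener is a hypothetical helper name.

# Reads "netstat -tulpen" style lines on stdin.
# $1 = port number. Succeeds if a TCP LISTEN line has that local port.
has_listener() {
    awk -v port="$1" '
        # $4 = local address (e.g. 0.0.0.0:8080), $6 = socket state
        $1 ~ /^tcp/ && $6 == "LISTEN" && $4 ~ (":" port "$") { found = 1 }
        END { exit(found ? 0 : 1) }
    '
}
```

On the appliance it could be used like this: netstat -tulpen | has_listener 8080 || echo "nothing is listening on port 8080". Note that it only checks the port, not the name of the process behind it.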