##############################################################################
Random Simple Things that have Worked in the Past and are Likely to Work Again
##############################################################################

A Sunfire Goes Down
-------------------

As of 6/27/2016.

Note: one of the sunfires can't be booted with all the disks in. If this
sunfire fails, pop all the disks except for the SSD (which is a pain to get
back in) and the boot drives (drives 1 and 2), then boot the machine. After it
has booted, put the disks back in and follow the instructions below.

1. ssh onto magellan and run ``bmc sunfire0-bmc chassis power cycle``.

2. ssh onto the sunfire, probably through magellan. Run ``zfs mount -a`` if
   the zpools aren't already mounted, and then start ceph (either through
   SysV init or systemd, depending on the sunfire).

3. Monitor ``ceph health`` (on any ceph monitor: crimea, magellan, or gomes)
   to ensure that ceph comes back up properly.

The Website is Reachable, but everything 403's or 404's
-------------------------------------------------------

As of 7/6/2016.

Restart web.vm.

Mail server fails IMAP requests
-------------------------------

As of 7/21/2016.

Run ``sudo journalctl -u dovecot`` on crimea.acm.jhu.edu. If it says that a
connection to acmsys/Maildir timed out, then there's a problem with the AFS
mail dir servers on chicago.

First things first, check the ZFS status by running ``zpool status``. If that
says something is wrong, debug the zpool.

To restart the maildir server, run ``/etc/init.d/openafs-fileserver restart``.
If it takes longer than ~10 minutes, something else is wrong; try restarting
chicago.

Echidna's AFS servers died
--------------------------

As of 9/19/2016.

Reboot echidna.

You can't do ceph things with cinder (like create/delete volumes)
-----------------------------------------------------------------

As of 9/25/2016.

Restart cinder-volume on gomes.

Ceph won't start on a sunfire due to permission errors
------------------------------------------------------

As of 2/28/2017.

Run ``chown -R ceph:ceph /var/run/ceph``, then try again. See
http://tracker.ceph.com/issues/15553 for more info.

A ceph mon is down after a restart
----------------------------------

As of 3/11/2017.

Run ``systemctl restart ceph``. The issue is that, since our ceph config is
served out of AFS, we have an implicit dependency on AFS, but systemd doesn't
know about it (this should be fixed at some point). By the time you ssh into
the machine to manually restart ceph, openafs-client should be up, so simply
restarting ceph should just work.

OpenStack VMs won't be deleted, and they just hang
--------------------------------------------------

As of 3/11/2017.

Reboot gomes.

You Can't Delete OpenStack VMs (they're stuck in the deleting state)
--------------------------------------------------------------------

As of 4/12/2017.

ssh to the compute node that the instance was running on, and restart the
nova daemon on it.
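
A minimal sketch of the procedure above. The unit name is an assumption
(``nova-compute`` on Debian/Ubuntu nodes, ``openstack-nova-compute`` on
RHEL-family ones), and the lookup needs OpenStack admin credentials sourced::

    # Find which compute node the stuck instance was scheduled to.
    nova show <instance-uuid> | grep hypervisor_hostname

    # Then, on that compute node, restart the nova compute daemon
    # (unit name assumed; adjust to whatever the node actually runs).
    sudo systemctl restart nova-compute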