A. Public-facing practices:
- *1. Are user requests tracked via a ticket system?
- *2. Are "the 3 empowering policies" defined and published?
- 3. Does the team record monthly metrics?
B. Modern team practices:
- *4. Do you have a "policy and procedure" wiki?
- 5. Do you have a password safe?
- 6. Is your team's code kept in a source code control system?
- 7. Does your team use a bug-tracking system for their own code?
- 8. In your bugs/tickets, does stability have a higher priority than new features?
- 9. Does your team write "design docs"?
- 10. Do you have a "post-mortem" process?
C. Operational practices:
- *11. Does each service have an OpsDoc?
- *12. Does each service have appropriate monitoring?
- 13. Do you have a pager rotation schedule?
- 14. Do you have separate development, QA, and production systems?
- 15. Do roll-outs to many machines have a "canary process"?
D. Automation practices:
- 16. Do you use configuration management tools like CFEngine, Puppet, or Chef?
- 17. Do automated administration tasks run under role accounts?
- 18. Do automated processes that generate email only do so when they have something to say?
E. Fleet management practices:
- *19. Is there a database of all machines?
- 20. Is OS installation automated?
- *21. Can you automatically patch software across your entire fleet?
- 22. Do you have a hardware refresh policy?
F. "We acknowledge that hardware breaks" practices:
- *23. Can your servers keep operating even if one disk dies?
- 24. Is the network core N+1?
- *25. Are your backups automated?
- *26. Are your disaster recovery plans tested periodically?
- 27. Do machines in your data center have remote power and console access?
G. Security practices: