Project author: vacovsky

Project description:
Control health checks and toggle upstream node status in load balancers with ease.
Language: Go
Project URL: git://github.com/vacovsky/poolse.git
Created: 2017-03-08T18:44:41Z
Project community: https://github.com/vacovsky/poolse

License: Other


Poolse

Demo at poolse.vacovsky.us.

Features

  • Exposes a simple interface to “up” and “down” upstream nodes.

  • Monitoring of targets can be as simple or complex as needed. HTTP status code checking, response string parsing, and fail/success thresholds are supported for determining the health of upstream nodes.

  • Multiple status formats are provided, including 503, no response, and partial or full JSON blobs describing the state and parameters per target.

  • Reconfigure/reload of settings on the fly, without “downing” the node (this is handy when using automation platforms like Chef).

  • Provides a common format/interface for managing applications which are behind some form of reverse proxy or load balancer that checks at interval for server health status.

Building

  1. go get -u github.com/davecgh/go-spew/spew
  2. git clone https://github.com/vacoj/poolse.git
  3. cd poolse/src/poolse
  4. go build

Configuration file

  • To start with a specific configuration file, pass its path as the first argument:
    ./poolse /path/to/config.json
  • Annotated example configuration:
    {
      "state": {
        "startup_state": true, // if true, the server state is marked OK at startup once the first check passes
        "administrative_state": "AdminOff", // if persistent state isn't on, this is the default startup state for the STATUS. It will only be OK if all Targets are also OK on the first check
        "persist_state": true // indicates whether or not STATUS.State.AdministrativeState should be sticky between settings/application restarts and reloads
      },
      "targets": [
        {
          "endpoint": "http://localhost:5704/fakehealth", // url to your application's health endpoint
          "polling_interval": 15, // polling interval for target endpoint, in seconds
          "expected_status_code": 200, // *required* HTTP status code to look for. If this isn't returned when the check happens, we mark OK as false
          "up_count_threshold": 10, // this many successful checks will mark target as online
          "down_count_threshold": 10 // this many failed checks will mark target as offline
        },
        {
          "name": "Expected Example",
          "endpoint": "http://localhost:5704/fakeexpected", // url to your application's health endpoint
          "polling_interval": 20, // polling interval for target endpoint, in seconds
          "expected_status_code": 200, // HTTP status code to look for. If this isn't returned when the check happens, we mark OK as false
          "expected_response_strings": ["{\"is_working\": true}"],
          "up_count_threshold": 10, // this many successful checks will mark target as online
          "down_count_threshold": 10 // this many failed checks will mark target as offline
        },
        {
          "name": "Unexpected Example",
          "endpoint": "http://localhost:5704/fakeexpected", // url to your application's health endpoint
          "polling_interval": 10, // polling interval for target endpoint, in seconds
          "expected_status_code": 200, // HTTP status code to look for. If this isn't returned when the check happens, we mark OK as false
          "unexpected_response_strings": ["{\"is_working\": false}"], // the response is searched for each string. Blank entries are ignored. If a string is found, OK is set to false (for example, searching the response text for {"thisthing": false})
          "up_count_threshold": 10, // this many successful checks will mark target as online
          "down_count_threshold": 10 // this many failed checks will mark target as offline
        },
        {
          "name": "Fake Smoke", // arbitrary; use for your own reasons, or leave it blank
          "endpoint": "http://localhost:5704/fakesmoke", // url to your application's health endpoint
          "polling_interval": 300, // polling interval for target endpoint, in seconds
          "expected_status_code": 200 // *required* HTTP status code to look for. If this isn't returned when the check happens, we mark OK as false
        }
      ],
      "service": {
        "http_port": "5704", // *string, not int; port to listen on for incoming web requests
        "debug": false, // displays certain pieces of data in console if true
        "show_http_log": true, // shows log in console of calls being made if true
        "state_file_name": "state.dat", // name of state file. Defaults to state.dat if not present
        "follow_redirects": false // whether or not to follow redirects when querying targets
      }
    }
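
For readers who want to generate or validate a config file from their own tooling, the layout above maps onto Go structs along these lines. This is a sketch under assumptions: the type and function names (`Config`, `Target`, `parseConfig`) are invented here and may not match the project's own Settings struct. Go's `encoding/json` rejects `//` comments, so the sample input is comment-free.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// Target mirrors one entry of the "targets" array (hypothetical type name).
type Target struct {
	Name                      string   `json:"name"`
	Endpoint                  string   `json:"endpoint"`
	PollingInterval           int      `json:"polling_interval"` // seconds
	ExpectedStatusCode        int      `json:"expected_status_code"`
	ExpectedResponseStrings   []string `json:"expected_response_strings"`
	UnexpectedResponseStrings []string `json:"unexpected_response_strings"`
	UpCountThreshold          int      `json:"up_count_threshold"`
	DownCountThreshold        int      `json:"down_count_threshold"`
}

// Config mirrors the top-level layout of the config file (hypothetical type name).
type Config struct {
	State struct {
		StartupState        bool   `json:"startup_state"`
		AdministrativeState string `json:"administrative_state"`
		PersistState        bool   `json:"persist_state"`
	} `json:"state"`
	Targets []Target `json:"targets"`
	Service struct {
		HTTPPort        string `json:"http_port"` // a string, not an int
		Debug           bool   `json:"debug"`
		ShowHTTPLog     bool   `json:"show_http_log"`
		StateFileName   string `json:"state_file_name"`
		FollowRedirects bool   `json:"follow_redirects"`
	} `json:"service"`
}

// parseConfig decodes comment-free JSON and applies two documented rules:
// expected_status_code is required, and state_file_name defaults to state.dat.
func parseConfig(data []byte) (Config, error) {
	var cfg Config
	if err := json.Unmarshal(data, &cfg); err != nil {
		return cfg, err
	}
	for i, t := range cfg.Targets {
		if t.ExpectedStatusCode == 0 {
			return cfg, fmt.Errorf("target %d: expected_status_code is required", i)
		}
	}
	if cfg.Service.StateFileName == "" {
		cfg.Service.StateFileName = "state.dat"
	}
	return cfg, nil
}

func main() {
	raw := []byte(`{
	  "state": {"startup_state": true, "administrative_state": "AdminOff", "persist_state": true},
	  "targets": [{"endpoint": "http://localhost:5704/fakehealth", "polling_interval": 15, "expected_status_code": 200}],
	  "service": {"http_port": "5704"}
	}`)
	cfg, err := parseConfig(raw)
	if err != nil {
		fmt.Println("invalid config:", err)
		return
	}
	fmt.Println(cfg.Service.HTTPPort, cfg.Service.StateFileName, len(cfg.Targets))
}
```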

Application Controls / API

Dashboard

“/”

  • Displays simple dashboard of targets and status to caller.
  • Requires some setup:
  1. cd poolse/src/poolse/static
  2. bower install

Status Endpoints

“/status”

  • Shows long-form status to the caller
  • All endpoints under “/status” accept an id query string parameter (example: /status/simple?id=1) that returns just the target object at that index; id corresponds to the index of the target in the config file.
    {
      "State": {
        "ok": true,
        "startup_state": true,
        "persist_state": true,
        "administrative_state": ""
      },
      "Targets": [
        {
          "id": 0,
          "name": "",
          "endpoint": "http://localhost",
          "polling_interval": 5,
          "expected_status_code": 200,
          "expected_response_strings": null,
          "unexpected_response_strings": null,
          "last_ok": "2017-03-23T08:30:51.1019184-07:00",
          "last_checked": "2017-03-23T08:30:51.1019184-07:00",
          "ok": true,
          "up_count": 6987,
          "up_count_threshold": 10,
          "down_count": 0,
          "down_count_threshold": 1
        }
      ],
      "Version": "0.3.10"
    }
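
A consumer of the long-form status might decode only the fields it cares about. The sketch below is one possible approach, with hypothetical names (`Status`, `TargetStatus`, `failingTargets`); the JSON keys are taken from the sample response above, and the two-target body in `main` is made up for illustration.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// TargetStatus holds a subset of the per-target fields (hypothetical type name).
type TargetStatus struct {
	ID        int    `json:"id"`
	Name      string `json:"name"`
	Endpoint  string `json:"endpoint"`
	OK        bool   `json:"ok"`
	UpCount   int    `json:"up_count"`
	DownCount int    `json:"down_count"`
}

// Status holds a subset of the top-level /status document (hypothetical type name).
type Status struct {
	State struct {
		OK                  bool   `json:"ok"`
		AdministrativeState string `json:"administrative_state"`
	} `json:"State"`
	Targets []TargetStatus `json:"Targets"`
	Version string         `json:"Version"`
}

// failingTargets returns the ids of all targets whose "ok" field is false.
func failingTargets(body []byte) ([]int, error) {
	var s Status
	if err := json.Unmarshal(body, &s); err != nil {
		return nil, err
	}
	var ids []int
	for _, t := range s.Targets {
		if !t.OK {
			ids = append(ids, t.ID)
		}
	}
	return ids, nil
}

func main() {
	// Illustrative body only; a real one would come from GET /status.
	body := []byte(`{
	  "State": {"ok": false, "administrative_state": ""},
	  "Targets": [
	    {"id": 0, "endpoint": "http://localhost", "ok": true, "up_count": 6987, "down_count": 0},
	    {"id": 1, "endpoint": "http://localhost/other", "ok": false, "up_count": 0, "down_count": 3}
	  ],
	  "Version": "0.3.10"
	}`)
	ids, err := failingTargets(body)
	if err != nil {
		fmt.Println("decode error:", err)
		return
	}
	fmt.Println("failing target ids:", ids)
}
```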

“/status/simple”

  • Returns 200 if “State” and all of the Targets (targets[i].ok) are true.
  • Returns NO RESPONSE error if any of the Targets (targets[i].ok) are false, or if State is false. This is for F5’s health monitor.

“/status/simple2”

  • Returns 200 status when “State”, and all of the Targets (targets[i].ok) are true.
  • Returns NO RESPONSE error if any of the Targets (targets[i].ok) are false, or if State is false. This is for F5’s health monitor.

Toggle Endpoints

“/toggle”

  • If “State” is false and both “HealthStatus.OK” and “SmokeStatus.OK” are true, sets “State” to true.
  • If “State” is true, sets “State” to false.

“/toggle/on”

  • If “HealthStatus.OK” and “SmokeStatus.OK” are true, sets “State” to true.

“/toggle/off”

  • Sets “State” to false.

“/toggle/adminoff”

  • Forces an unhealthy status, no matter what else is “OK”.

“/toggle/adminon”

  • Forces a successful response, no matter what else is not “OK”.

“/toggle/adminreset”

  • Clears state.dat and resets AdministrativeState to empty
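
The toggle endpoints above can be read as a small state machine. The following sketch is one interpretation, not the project's code: the `Node` type is hypothetical, the health and smoke checks are collapsed into a single `targetsOK` flag, and "AdminOn" is an assumed counterpart to the "AdminOff" value shown in the config.

```go
package main

import "fmt"

// Node models the toggle state described above (hypothetical type).
type Node struct {
	State bool   // operator-controlled up/down flag
	Admin string // "", "AdminOn" (assumed value), or "AdminOff"
}

// Toggle mirrors /toggle: an up node goes down; a down node only comes up
// when the monitored targets are OK.
func (n *Node) Toggle(targetsOK bool) {
	if n.State {
		n.State = false
		return
	}
	if targetsOK {
		n.State = true
	}
}

// AdminReset mirrors /toggle/adminreset by clearing the administrative override.
func (n *Node) AdminReset() { n.Admin = "" }

// Healthy is the status a load balancer would see: administrative overrides
// win, otherwise both State and the targets must be OK.
func (n *Node) Healthy(targetsOK bool) bool {
	switch n.Admin {
	case "AdminOff":
		return false
	case "AdminOn":
		return true
	}
	return n.State && targetsOK
}

func main() {
	n := &Node{}
	n.Toggle(true)
	fmt.Println(n.Healthy(true)) // up, targets OK

	n.Admin = "AdminOff"
	fmt.Println(n.Healthy(true)) // administratively forced down

	n.AdminReset()
	n.Toggle(true)
	fmt.Println(n.Healthy(true)) // toggled off again
}
```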

Settings Endpoints

“/settings”

  • Displays currently loaded Settings struct, as populated from the config file.

“/settings/reload”

  • Reloads configuration file into memory and restarts all target monitors contained within. A configuration reload generally takes as much time as the longest polling interval present in the targets list, plus 5 seconds.

Testing Endpoints

“/fakesmoke”, “/fakeexpected”, and “/fakehealth”

  • Fake endpoints that return 200, possibly with some content, and are the defaults in the config file. They exist only for proof of concept and testing. If you have a health endpoint but no smoke endpoint, either leave the smoke target at its default value in the config file or delete the unused targets from the config.