Project author: JuliaPOMDP

Project description:
Concise and friendly interfaces for defining MDP and POMDP models for use with POMDPs.jl solvers
Language: Julia
Repository: git://github.com/JuliaPOMDP/QuickPOMDPs.jl.git
Created: 2017-12-16T08:21:36Z
Project page: https://github.com/JuliaPOMDP/QuickPOMDPs.jl

License: Other



QuickPOMDPs

Simplified interfaces for specifying POMDPs.jl models.

The package contains two interfaces - the Quick interface, and the Discrete Explicit interface.

Please see the documentation for more information on each.

The package can also be used from Python via pyjulia. See examples/tiger.py for an example.

Quick Interface

The Quick Interface exposes nearly all of the features of POMDPs.jl as constructor keyword arguments. See the documentation for details. Mountain Car example:

    using QuickPOMDPs: QuickMDP
    using POMDPTools: Deterministic  # older releases: POMDPModelTools

    mountaincar = QuickMDP(
        # generative model: returns the next state and reward for (s, a)
        gen = function (s, a, rng)
            x, v = s
            vp = clamp(v + a*0.001 + cos(3*x)*-0.0025, -0.07, 0.07)
            xp = x + vp
            if xp > 0.5
                r = 100.0
            else
                r = -1.0
            end
            return (sp=(xp, vp), r=r)
        end,
        actions = [-1., 0., 1.],
        initialstate = Deterministic((-0.5, 0.0)),
        discount = 0.95,
        isterminal = s -> s[1] > 0.5
    )
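
To see what the generative function does without installing any solver, the dynamics can be replayed in plain Julia. The sketch below is not part of QuickPOMDPs: `mc_step`, `rollout`, and `momentum_policy` are hypothetical helper names, and the momentum heuristic (push in the direction of the current velocity) is only an assumed illustrative policy, not something the package provides.

```julia
# Same dynamics as the gen function above, written as a plain function
# (hypothetical helper; the rng argument is dropped because nothing is random).
function mc_step(s, a)
    x, v = s
    vp = clamp(v + a*0.001 + cos(3*x)*-0.0025, -0.07, 0.07)
    xp = x + vp
    r = xp > 0.5 ? 100.0 : -1.0
    return (sp=(xp, vp), r=r)
end

# Roll the model forward under `policy`, accumulating the discounted reward.
# Stops at a terminal state (x > 0.5) or after `max_steps` steps.
function rollout(policy; s0=(-0.5, 0.0), discount=0.95, max_steps=500)
    s = s0
    total = 0.0
    states = [s0]
    for t in 0:max_steps-1
        s[1] > 0.5 && break
        out = mc_step(s, policy(s))
        total += discount^t * out.r
        s = out.sp
        push!(states, s)
    end
    return total, states
end

# Assumed heuristic: accelerate in the direction the car is already moving.
momentum_policy(s) = s[2] >= 0 ? 1.0 : -1.0

total, states = rollout(momentum_policy)
println("steps: ", length(states) - 1, ", discounted return: ", total)
```

The same loop structure is roughly what a simulator does with the `gen`, `discount`, and `isterminal` arguments, which is why those three are enough to specify this MDP.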

Tiger POMDP Example:

    using QuickPOMDPs: QuickPOMDP
    using POMDPTools: Deterministic, Uniform, SparseCat  # older releases: POMDPModelTools

    tiger = QuickPOMDP(
        states = ["left", "right"],
        actions = ["left", "right", "listen"],
        observations = ["left", "right"],
        initialstate = Uniform(["left", "right"]),
        discount = 0.95,

        transition = function (s, a)
            if a == "listen"
                return Deterministic(s) # tiger stays behind the same door
            else # a door is opened
                return Uniform(["left", "right"]) # reset
            end
        end,

        observation = function (s, a, sp)
            if a == "listen"
                if sp == "left"
                    return SparseCat(["left", "right"], [0.85, 0.15]) # sparse categorical distribution
                else
                    return SparseCat(["right", "left"], [0.85, 0.15])
                end
            else
                return Uniform(["left", "right"])
            end
        end,

        reward = function (s, a)
            if a == "listen"
                return -1.0
            elseif s == a # the tiger was found
                return -100.0
            else # the tiger was avoided
                return 10.0
            end
        end
    )
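
The observation model above is what a belief updater consumes. As a rough illustration in plain Julia (no POMDPs.jl packages; `tiger_states`, `obs_prob`, and `listen_update` are hypothetical names introduced here), the following sketch applies Bayes' rule after a "listen" action. Because listening leaves the tiger where it is, the prediction step is the identity and only the observation likelihood matters.

```julia
tiger_states = ["left", "right"]

# P(o | sp) for a = "listen", matching the SparseCat values above
obs_prob(sp, o) = o == sp ? 0.85 : 0.15

# Bayes' rule: posterior(s) is proportional to prior(s) * P(o | s)
function listen_update(b::Dict{String,Float64}, o::String)
    unnormalized = Dict(s => b[s] * obs_prob(s, o) for s in tiger_states)
    z = sum(values(unnormalized))
    return Dict(s => p / z for (s, p) in unnormalized)
end

b0 = Dict("left" => 0.5, "right" => 0.5)  # uniform initial belief
b1 = listen_update(b0, "left")            # hear the tiger on the left once
b2 = listen_update(b1, "left")            # and a second time

println("P(left) after one observation:  ", b1["left"])
println("P(left) after two observations: ", b2["left"])
```

Each consistent observation sharpens the belief, which is the trade-off the -1.0 listening cost is balanced against.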

Discrete Explicit Interface

The Discrete Explicit Interface is an older, less powerful interface suitable for problems with small discrete state, action, and observation spaces. Though it is less powerful, it may be pedagogically useful because each element of the (S, A, O, R, T, Z, γ) tuple for a POMDP and of the (S, A, R, T, γ) tuple for an MDP is defined explicitly in a straightforward manner. See the documentation for details. Tiger POMDP example:

    using QuickPOMDPs: DiscreteExplicitPOMDP

    S = [:left, :right]
    A = [:left, :right, :listen]
    O = [:left, :right]
    γ = 0.95

    # T(s, a, sp): probability of moving from s to sp under action a
    function T(s, a, sp)
        if a == :listen
            return s == sp
        else # a door is opened
            return 0.5 # reset
        end
    end

    # Z(a, sp, o): probability of observing o after action a leads to sp
    function Z(a, sp, o)
        if a == :listen
            if o == sp
                return 0.85
            else
                return 0.15
            end
        else
            return 0.5
        end
    end

    # R(s, a): reward for taking action a in state s
    function R(s, a)
        if a == :listen
            return -1.0
        elseif s == a # the tiger was found
            return -100.0
        else # the tiger was avoided
            return 10.0
        end
    end

    m = DiscreteExplicitPOMDP(S,A,O,T,Z,R,γ)
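
Because T and Z return raw probabilities, one quick sanity check is that they sum to 1 over next states and over observations. The sketch below is plain Julia and does not call QuickPOMDPs; `sums_to_one` is a hypothetical helper, and the compact one-line definitions restate the functions above only so that the block runs on its own.

```julia
S = [:left, :right]
A = [:left, :right, :listen]
O = [:left, :right]

# Compact restatements of the transition and observation functions above
T(s, a, sp) = a == :listen ? Float64(s == sp) : 0.5
Z(a, sp, o) = a == :listen ? (o == sp ? 0.85 : 0.15) : 0.5

# True when f, evaluated over every outcome, sums to (approximately) 1
sums_to_one(f, outcomes) = isapprox(sum(f(x) for x in outcomes), 1.0)

# Check every (s, a) row of T and every (a, sp) row of Z
T_ok = all(sums_to_one(sp -> T(s, a, sp), S) for s in S, a in A)
Z_ok = all(sums_to_one(o -> Z(a, sp, o), O) for a in A, sp in S)

println("T valid: ", T_ok, ", Z valid: ", Z_ok)
```

A check like this is cheap for small spaces and catches the most common mistake in explicit models, a row whose probabilities do not add up.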