Getting Started
This guide covers the basics: creating environments, running the step loop, understanding spaces, and using the built-in environments.
Installation
opam install fehu
Or build from source:
git clone https://github.com/raven-ml/raven
cd raven && dune build fehu
Creating an Environment
Environments are created via factory functions in Fehu_envs. Randomness is
provided by the implicit RNG scope from Nx.Rng.run:
open Fehu
let () = Nx.Rng.run ~seed:42 @@ fun () ->
  let env = Fehu_envs.Cartpole.make () in
  ignore env
The seed controls all randomness in the scope. Use the same seed to get the same episode sequence.
The Step Loop
An environment follows a strict lifecycle: reset must be called before the
first step, and again after any terminal step (terminated or truncated).
open Fehu
let () = Nx.Rng.run ~seed:42 @@ fun () ->
  let env = Fehu_envs.Cartpole.make () in
  (* Reset returns the initial observation and info *)
  let _obs, _info = Env.reset env () in
  (* Step returns observation, reward, terminated, truncated, info *)
  let s = Env.step env (Space.Discrete.of_int 0) in
  Printf.printf "reward: %.1f, terminated: %b, truncated: %b\n"
    s.reward s.terminated s.truncated
A complete episode loop:
open Fehu
let run_episode env =
  let _obs, _info = Env.reset env () in
  let done_ = ref false in
  let total_reward = ref 0.0 in
  while not !done_ do
    let act = Space.sample (Env.action_space env) in
    let s = Env.step env act in
    total_reward := !total_reward +. s.reward;
    done_ := s.terminated || s.truncated
  done;
  !total_reward
let () = Nx.Rng.run ~seed:42 @@ fun () ->
  let env = Fehu_envs.Cartpole.make () in
  let _reward = run_episode env in ()
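A single episode under a random policy is noisy, so it is common to average the return over several episodes. The sketch below shows one way to do that. `average_return` is a hypothetical helper, not part of Fehu, and it takes the episode runner as a plain function so the example runs without an environment.

```ocaml
(* Hypothetical helper, not part of Fehu: average the return of
   [episodes] runs of [run], where [run ()] plays one episode and
   returns its total reward. *)
let average_return ~episodes run =
  if episodes <= 0 then
    invalid_arg "average_return: episodes must be positive";
  let total = ref 0.0 in
  for _i = 1 to episodes do
    total := !total +. run ()
  done;
  !total /. float_of_int episodes

(* Stand-in runner so the sketch runs on its own: every "episode"
   returns 10.0. *)
let () =
  let avg = average_return ~episodes:4 (fun () -> 10.0) in
  Printf.printf "average return: %.1f\n" avg
```

With the run_episode function above, the call would look like `average_return ~episodes:10 (fun () -> run_episode env)`, placed inside an Nx.Rng.run scope.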
Spaces
Spaces define the valid observations and actions for an environment. They provide sampling, validation, and serialization.
Discrete
A finite set of integer choices, numbered from 0. Used for environments with a finite number of actions (e.g., left/right).
open Fehu
let space = Space.Discrete.create 4 (* actions 0, 1, 2, 3 *)
let _n = Space.Discrete.n space (* 4 *)
(* Sample a random action (requires an Nx.Rng scope) *)
let _act = Nx.Rng.run ~seed:0 @@ fun () ->
  Space.sample space
(* Convert between int and discrete element *)
let act = Space.Discrete.of_int 2
let _i = Space.Discrete.to_int act (* 2 *)
(* Check membership *)
let _valid = Space.contains space act (* true *)
Box
Continuous vectors with per-dimension bounds. Used for continuous observations (e.g., position, velocity) and continuous actions.
open Fehu
let space = Space.Box.create
  ~low:[| -1.0; -2.0 |]
  ~high:[| 1.0; 2.0 |]
let _low, _high = Space.Box.bounds space
let _obs = Nx.Rng.run ~seed:0 @@ fun () -> Space.sample space
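Conceptually, a vector belongs to a box when every component lies between the corresponding low and high bound. The following is a plain-OCaml illustration of that check, not Fehu's implementation. `in_box` is a hypothetical name, and treating the bounds as inclusive is an assumption.

```ocaml
(* Hypothetical illustration, not Fehu's implementation: a vector is
   inside the box when it has the right length and each component lies
   within its bounds. Inclusive bounds are an assumption. *)
let in_box ~(low : float array) ~(high : float array) (x : float array) =
  Array.length x = Array.length low
  && Array.length x = Array.length high
  && begin
    let ok = ref true in
    Array.iteri
      (fun i v -> if v < low.(i) || v > high.(i) then ok := false)
      x;
    !ok
  end

let () =
  let low = [| -1.0; -2.0 |] and high = [| 1.0; 2.0 |] in
  Printf.printf "%b %b\n"
    (in_box ~low ~high [| 0.5; -1.5 |])  (* both components within bounds *)
    (in_box ~low ~high [| 0.5; 2.5 |])   (* second component above its bound *)
```

In real code, Space.contains performs the membership test for any space type, so a helper like this is only useful for building intuition.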
Other Space Types
- Multi_binary: binary vectors of fixed length (multi-label scenarios)
- Multi_discrete: multiple discrete axes with independent cardinalities
- Tuple: fixed-length heterogeneous sequences
- Dict: named fields with different space types
- Sequence: variable-length homogeneous sequences
- Text: character strings from a fixed alphabet
All spaces support contains, sample, pack/unpack (to/from the
universal Value.t type), and boundary_values.
Available Environments
CartPole
Classic cart-pole balancing. Push a cart left or right to keep a pole upright. Reward is +1.0 per step. Terminates when the pole angle exceeds +/-12 degrees or the cart position leaves +/-2.4. Truncates at 500 steps.
- Observation: Box [4] -- x, x_dot, theta, theta_dot
- Actions: Discrete 2 -- 0 = push left, 1 = push right
let env = Fehu_envs.Cartpole.make ()
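The CartPole termination and truncation rules can be written out as small predicates. This is an illustrative sketch, not code from Fehu_envs. The names are hypothetical, and it assumes that theta is measured in radians, that the limits must be strictly exceeded, and that the observation array is ordered x, x_dot, theta, theta_dot.

```ocaml
(* Illustrative sketch of the CartPole end-of-episode rules; not taken
   from Fehu_envs. Assumes theta is in radians and that the limits
   must be strictly exceeded. *)
let theta_limit = 12.0 *. Float.pi /. 180.0  (* 12 degrees in radians *)
let x_limit = 2.4
let max_steps = 500

(* obs = [| x; x_dot; theta; theta_dot |] *)
let terminated (obs : float array) =
  Float.abs obs.(0) > x_limit || Float.abs obs.(2) > theta_limit

let truncated ~steps = steps >= max_steps

let () =
  assert (not (terminated [| 0.0; 0.0; 0.05; 0.0 |]));  (* upright, centred *)
  assert (terminated [| 2.5; 0.0; 0.0; 0.0 |]);          (* cart out of range *)
  assert (terminated [| 0.0; 0.0; 0.3; 0.0 |]);          (* pole past 12 degrees *)
  assert (truncated ~steps:500)
```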
MountainCar
A car in a valley must build momentum to climb a hill. Reward is -1.0 per step. Terminates when position >= 0.5 with non-negative velocity. Truncates at 200 steps.
- Observation: Box [2] -- position, velocity
- Actions: Discrete 3 -- 0 = push left, 1 = coast, 2 = push right
let env = Fehu_envs.Mountain_car.make ()
GridWorld
5x5 grid navigation with an obstacle. Agent starts at (0,0), goal at (4,4), obstacle at (2,2). Reward is +10.0 at goal, -1.0 otherwise. Truncates at 200 steps.
- Observation: Multi_discrete [5; 5] -- (row, col)
- Actions: Discrete 4 -- 0 = up, 1 = down, 2 = left, 3 = right
let env = Fehu_envs.Grid_world.make ()
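To make the GridWorld reward structure concrete, the sketch below replays one shortest path that avoids the obstacle and sums its rewards. It is a toy model written for this guide, not the Fehu_envs implementation. It assumes that "down" increases the row, that "right" increases the column, and that the step which reaches the goal earns +10.0 instead of -1.0.

```ocaml
(* Toy model of the GridWorld rewards; not the Fehu_envs
   implementation. Assumes action 1 (down) increases the row and
   action 3 (right) increases the column. *)
let goal = (4, 4)
let obstacle = (2, 2)

let move (row, col) action =
  match action with
  | 0 -> (row - 1, col)  (* up *)
  | 1 -> (row + 1, col)  (* down *)
  | 2 -> (row, col - 1)  (* left *)
  | 3 -> (row, col + 1)  (* right *)
  | _ -> invalid_arg "move: action must be in 0..3"

(* Reward for arriving at [pos]: +10.0 at the goal, -1.0 otherwise. *)
let reward pos = if pos = goal then 10.0 else -1.0

(* Sum the rewards along a path, checking that it never enters the
   obstacle or leaves the 5x5 grid. *)
let path_return actions =
  let step (pos, total) action =
    let ((r, c) as next) = move pos action in
    assert (next <> obstacle && r >= 0 && r < 5 && c >= 0 && c < 5);
    (next, total +. reward next)
  in
  snd (List.fold_left step ((0, 0), 0.0) actions)

let () =
  (* Four moves down, then four moves right: 7 steps at -1.0, then +10.0. *)
  let ret = path_return [ 1; 1; 1; 1; 3; 3; 3; 3 ] in
  Printf.printf "return: %.1f\n" ret
```

Under these assumptions the best achievable return is 3.0, since any path from (0,0) to (4,4) needs at least eight moves.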
RandomWalk
One-dimensional random walk on [-10, 10]. Reward is -|position|. Terminates at boundaries or after 200 steps.
- Observation: Box [1] in [-10.0, 10.0]
- Actions: Discrete 2 -- 0 = left, 1 = right
let env = Fehu_envs.Random_walk.make ()
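The RandomWalk reward depends only on the distance from the origin, so staying near zero is optimal. A minimal sketch of that formula follows. The name `reward` is hypothetical and the code is not taken from Fehu_envs.

```ocaml
(* Sketch of the RandomWalk reward: the negative distance from the
   origin. Not taken from Fehu_envs. *)
let reward position = -. Float.abs position

let () =
  assert (reward 0.0 = 0.0);
  assert (reward 3.0 = -3.0);
  assert (reward (-3.0) = -3.0)
```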
Render Modes
Environments can optionally render their state. Pass ~render_mode when
creating the environment:
open Fehu
let () = Nx.Rng.run ~seed:42 @@ fun () ->
  let env = Fehu_envs.Cartpole.make ~render_mode:`Ansi () in
  let _obs, _info = Env.reset env () in
  let _s = Env.step env (Space.Discrete.of_int 0) in
  (* Render after reset or step *)
  match Env.render env with
  | Some text -> print_endline text
  | None -> ()
Supported render modes vary by environment: Ansi for text output,
Rgb_array for pixel frames, Human for interactive display.
Next Steps
- Environments and Wrappers -- custom environments, wrappers, rendering, vectorized environments
- Collection and Evaluation -- trajectory collection, replay buffers, GAE, evaluation