Getting Started with sowilo
This guide shows you how to use sowilo for image processing and computer vision.
Installation
Sowilo isn't released yet. When it is, you'll install it with:
opam install sowilo
For now, build from source:
git clone https://github.com/raven-ml/raven
cd raven
dune build sowilo
Your First Image Processing Pipeline
Here's a working example that loads an image and detects edges:
open Sowilo
let () =
(* Load image as nx array *)
let img = Nx_io.load_image "photo.jpg" in
(* Convert to Rune tensor for processing *)
let tensor = Rune.of_bigarray (Nx.to_bigarray img) in
(* Apply operations *)
let processed =
tensor
|> to_grayscale (* Convert to grayscale *)
|> gaussian_blur ~ksize:5 ~sigma:1.0 (* Smooth *)
|> canny ~low:50. ~high:150. (* Detect edges *)
in
(* Convert back to nx and save *)
let result = Nx.of_bigarray (Rune.to_bigarray processed) in
Nx_io.save_image result "edges.png"
Key Concepts
Tensors, not custom types. Images are just tensors with shape [height; width; channels]
. This means any tensor operation works on images.
Dimension conventions:
-
Single image:
[H; W; C]
where C is 1 (grayscale) or 3 (RGB) -
Batch of images:
[N; H; W; C]
for batch size N
Type safety. Operations enforce correct tensor types:
-
Most filters work on
float32
tensors in range [0, 1] -
Use
to_float
andto_uint8
to convert between formats -
Some operations (like Sobel) output
int16
for derivatives
Differentiable. Since sowilo is built on Rune, all operations can be differentiated. Train neural networks that include classical CV operations.
Common Operations
(* Color space conversions *)
let gray = to_grayscale color_img
let bgr = rgb_to_bgr rgb_img
(* Type conversions *)
let float_img = to_float uint8_img (* [0,255] -> [0.0,1.0] *)
let uint_img = to_uint8 float_img (* [0.0,1.0] -> [0,255] *)
(* Basic transformations *)
let flipped = flip_vertical img
let cropped = crop img ~x:100 ~y:50 ~width:200 ~height:150
let resized = resize img ~height:224 ~width:224 ~interpolation:Bilinear
(* Filtering *)
let blurred = gaussian_blur img ~ksize:5 ~sigma:1.0
let denoised = median_blur img ~ksize:3
let averaged = box_filter img ~ksize:3
(* Morphology *)
let kernel = get_structuring_element Rect ~ksize:3
let eroded = erode img kernel
let dilated = dilate img kernel
(* Thresholding *)
let binary = threshold img ~thresh:128. ~maxval:255. ~typ:Binary
(* Edge detection *)
let grad_x, grad_y = sobel img ~dx:1 ~dy:0 ~ksize:3
let edges = canny img ~low:100. ~high:200.
Image I/O
Sowilo works with nx for image I/O:
(* Load images *)
let img = Nx_io.load_image "input.jpg"
(* Save images *)
Nx_io.save_image processed "output.png"
(* Display with hugin *)
let fig = Hugin.imshow img ~title:"Processed Image" in
Hugin.show fig
Working with Batches
Process multiple images at once:
(* Stack images into batch: [N; H; W; C] *)
let batch = Rune.Tensor.stack images ~axis:0
(* Apply operations - they broadcast over batch dimension *)
let processed_batch = gaussian_blur batch ~ksize:3 ~sigma:1.0
(* Split back to individual images *)
let processed_images = Rune.Tensor.unstack processed_batch ~axis:0
Integration with Deep Learning
Sowilo operations can be part of neural networks:
(* Preprocessing for neural network *)
let preprocess img =
img
|> to_float
|> resize ~height:224 ~width:224 ~interpolation:Bilinear
|> to_grayscale
|> gaussian_blur ~ksize:3 ~sigma:0.5
(* Use in training - gradients flow through! *)
let augmented = Model.forward model (preprocess input)
Next Steps
Check out the examples in sowilo/example/
for complete image processing pipelines including:
- Edge detection workflows
- Image filtering and denoising
- Morphological operations
- Threshold techniques
When Rune's JIT compiler lands, these operations will automatically compile to efficient GPU kernels.