Advent of Code 2024 part 3

Dec 16, 2024
Part 1 in Uiua was fairly simple, with the grid dimensions hardcoded for brevity.
For Part 2, while it was fairly easy to make an image, I struggled a bit with how to iterate through the steps to manually find the tree. So I jumped over to Clojure, where this was straightforward.
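The Clojure version revolves around a few small helpers (parse-inp, step, print-map) that the later snippets rely on. Roughly — this is a sketch of their shape rather than the exact code, and it assumes aoc/get-input returns the raw puzzle input as a string — they look like:
;; Each robot is a map {:p [x y] :v [dx dy]}, parsed from lines like "p=0,4 v=3,-3".
(defn parse-inp [inp]
  (for [[_ px py vx vy] (re-seq #"p=(-?\d+),(-?\d+) v=(-?\d+),(-?\d+)" inp)]
    {:p (mapv parse-long [px py])
     :v (mapv parse-long [vx vy])}))
;; Advance one robot n steps on a w x h grid with wraparound.
(defn step [{:keys [p v] :as bot} w h n]
  (assoc bot :p [(mod (+ (first p) (* n (first v))) w)
                 (mod (+ (second p) (* n (second v))) h)]))
;; Print the grid as ASCII art, one character per cell.
(defn print-map [w h bots]
  (let [occupied (set (map :p bots))]
    (doseq [y (range h)]
      (println (apply str (for [x (range w)]
                            (if (occupied [x y]) \# \.)))))))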
Llama 3.2-vision with ASCII art
Having solved the problem, I decided to play around with ollama and the llama3.2-vision model to see whether it could find the tree in the ASCII-art output. I first wrote a function to send a prompt to a local ollama server:
(defn ollama
  "Send prompt q to the local ollama server. Extra request fields
  (e.g. {:images [...]}) can be passed in opts and are merged into the body."
  [q & [opts]]
  (try
    (-> (http/post "http://localhost:11434/api/generate"
                   {:body (json/encode (merge {:model "llama3.2-vision"
                                               :prompt q
                                               :stream false}
                                              opts))})
        :body
        (json/parse-string true)
        :response)
    (catch Exception e
      (print "exception querying ollama: " (.getMessage e))
      (pprint (:body (ex-data e))))))
(comment
  (ollama "Is the sky blue? Only say yes or no.") ; "Yes."
  )
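For reference, the requires these snippets lean on look roughly like the following — the specific libraries here (babashka.http-client and cheshire) are stand-ins for whichever HTTP client and JSON library you have handy, and aoc/get-input is just a personal helper for fetching puzzle input:
(ns aoc2024.day14
  (:require [babashka.http-client :as http]   ; any http client with a post fn works
            [cheshire.core :as json]          ; json/encode and json/parse-string
            [clojure.pprint :refer [pprint]]
            [clojure.string :as st]
            [clojure.java.io :refer [file]]))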
And a function to step through and prompt, basically the same as the manual version:
(defn find-tree [& args]
  (loop [bots (parse-inp (aoc/get-input 2024 14))
         w 101
         h 103
         i 0]
    (let [m (with-out-str (print-map w h bots))
          p (str "Does this look like a christmas tree? Only say yes or no.\n" m)
          r (ollama p)]
      (println p)
      (println i)
      (println r)
      (when (st/index-of (st/lower-case r) "no")
        (recur (mapv #(step % w h 1) bots) w h (inc i))))))
It did terribly, often finding false positives and (when "Only say yes or no." was omitted from the prompt) giving poor reasoning.
Llama 3.2-vision with image
However, I realized that llama3.2-vision was trained on images, not ASCII art, so I really should be rendering to an image file and including that in the prompt. This took me down a deep rabbit hole of Java image classes: building an image with java.awt.image.BufferedImage, writing it out with javax.imageio.ImageIO, and of course Base64-encoding it with java.util.Base64. A great part of Clojure being hosted on the JVM is access to this huge library, but not being steeped in Java tradition, I find it all a bit cryptic compared to doing this sort of thing in Go.
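The interop boils down to a handful of imports, something like:
(import '(java.awt.image BufferedImage)
        '(javax.imageio ImageIO)
        '(java.util Base64)
        '(java.io ByteArrayOutputStream))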
(defn render-map
  "Render map to PNG and write to OutputStream os, and optionally also to file."
  [w h bots os scale & [filename]]
  (let [img (BufferedImage. (* w scale) (* h scale) BufferedImage/TYPE_INT_RGB)]
    (doseq [b bots]
      (.setRGB img
               (* scale (first (:p b)))
               (* scale (second (:p b)))
               scale scale
               (into-array Integer/TYPE (repeat scale 0x00ff88))
               0 0))
    (when filename (ImageIO/write img "png" (file filename)))
    (ImageIO/write img "png" os)))
(defn render-map-base64
  "Render map to PNG and return its base64 encoding. Optionally also write to file."
  [w h bots scale & [filename]]
  (let [os (ByteArrayOutputStream.)
        enc (Base64/getEncoder)]
    (render-map w h bots os scale filename)
    (.encodeToString enc (.toByteArray os))))
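A quick REPL check that the plumbing works (the filename here is arbitrary, just somewhere to dump a PNG for eyeballing):
(comment
  (let [bots (parse-inp (aoc/get-input 2024 14))]
    ;; returns the base64 string and also writes day14.png to disk
    (render-map-base64 101 103 bots 4 "day14.png"))
  )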
(defn find-tree-image [& args]
  (loop [bots (mapv #(step % 101 103 1) (parse-inp (aoc/get-input 2024 14)))
         w 101
         h 103
         i 1]
    (let [m (render-map-base64 w h bots 1)
          p (str "Does this image contain a christmas tree? Respond yes or no.")
          r (ollama p {:images [m]})]
      (println p)
      (println i)
      (println r)
      (when (st/index-of r "No")
        ;; advance a full vertical period (h = 103 steps) per iteration,
        ;; which keeps the number of slow LLM queries down
        (recur (mapv #(step % w h 103) bots) w h (+ i 103))))))
This worked perfectly! The LLM had no problem finding the tree and gave no false positives. It wasn't fast, taking about a minute per query on an M3 Pro with 18 GB of RAM, but it was a fun exercise.