What MicroGPT Found — 13 Bugs and 340 Lines of Workarounds

· 6 min read
MicroGPT: Part 2 of 3

The point of porting microgpt.py wasn’t to have a GPT written in Hew. It was to find out where the language breaks when you try to write something real. The answer: in 13 places, across four severity tiers, accounting for about 340 lines of workaround code — 31% of the file. In one case, two minor gaps combined to corrupt the entire training vocabulary.

Tier 1: The spec promises it, the compiler rejects it

These are the worst kind of bug — the spec says the feature exists, you write code that uses it, and the compiler says no.

Numeric conversion intrinsics. The spec documents .to_f64() and .to_i64() as compiler intrinsics on all numeric types. The compiler returns “no method to_f64 on i64.” The workaround was a 15-line function that decomposes integers bit-by-bit into floats, plus parallel integer and float counters everywhere you need to mix types:

// What it should be:
let ratio = step.to_f64() / total.to_f64();

// What I actually wrote:
var ratio_f = 0.0;
var step_tmp = step;
var place = 1.0;
while step_tmp > 0 {
    ratio_f = ratio_f + int_to_f64_fast(step_tmp % 10) * place;
    step_tmp = step_tmp / 10;
    place = place * 10.0;
}
ratio_f = ratio_f / int_to_f64_fast(total);

The Adam optimizer, RMSNorm, and the PRNG all have gratuitous complexity from this. The fix is a thin wrapper around one LLVM instruction — sitofp for int-to-float, fptosi for the reverse.
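To make the cost of the workaround concrete, here is the same digit-by-digit rebuild sketched in Python — a hypothetical reconstruction of the pattern, not the actual Hew helper — checked against the direct conversion the missing intrinsic would provide:

```python
def int_to_f64_slow(n: int) -> float:
    # Rebuild a float from a non-negative integer one decimal digit at a
    # time, mirroring the shape of the Hew workaround above.
    result = 0.0
    place = 1.0
    while n > 0:
        result += float(n % 10) * place  # single digits convert trivially
        n //= 10
        place *= 10.0
    return result
```

For integers small enough to be exactly representable in a double, this agrees with the direct conversion — but it is a loop plus a helper where one `sitofp` instruction would do.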

Vec<Vec<T>> codegen. Vec<T> where T is Vec<U> produces an MLIR error about !llvm.ptr in an integer context. This forced flat arrays with manual indexing for all matrices, and made HashMap<String, Vec<Int>> impossible — which meant hardcoded weight variable names instead of a dictionary mapping parameter names to weight vectors. The codegen needs to handle pointer-typed elements in Vec operations.
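The flat-array substitute is mechanical: store an n_rows × n_cols matrix in one buffer and compute every offset by hand. A Python sketch of the same row-major scheme (names are illustrative, not from the port):

```python
# Row-major flat indexing in place of Vec<Vec<f64>>:
# element (r, c) of an n_rows x n_cols matrix lives at r * n_cols + c.
n_rows, n_cols = 3, 4
flat = [0.0] * (n_rows * n_cols)

def idx(r: int, c: int) -> int:
    return r * n_cols + c

flat[idx(1, 2)] = 7.0  # the slot a nested vec would address as m[1][2]
```

It works, but every matrix access in the attention and MLP code pays this indexing tax, and a wrong `n_cols` is a silent corruption rather than a type error.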

Struct field mutation through function parameters. If you pass a struct to a function and try to assign to a field, the compiler rejects it — “cannot assign field on immutable variable.” But Vec.push() and v[i] = val work fine through parameters because they mutate heap data. This forced the PRNG to wrap every scalar field in a single-element Vec<Int>:

// What the PRNG state should look like:
type MersenneTwister { index: Int; /* ... */ }

// What I had to write instead:
type MersenneTwister { index: Vec<Int>; /* ... */ }
// mt.index becomes mt.index[0] at every access site

Vec type inference in struct literals. Tape { data: Vec::new(), ... } fails with “cannot determine element type for Vec” even though the field type is Vec<f64>. Every struct with Vec fields needs a verbose pre-declaration pattern — seven extra lines in tape_new().

Tier 2: The stdlib doesn’t exist yet

std::math — 97 lines of hand-rolled Taylor series and Newton’s method for exp, log, pow, sqrt, floor, abs, max. Slower than hardware, less precise, and the accumulated rounding error makes the training loss drift from Python’s. The fix is thin wrappers around LLVM intrinsics that already exist in the backend — maybe 50 lines of codegen.
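To give a sense of what "hand-rolled" means here, this is a Newton's-method square root in Python — a sketch of the technique, not the Hew code:

```python
def sqrt_newton(x: float, iters: int = 30) -> float:
    # Newton's method on f(g) = g*g - x: each step averages g with x/g.
    if x == 0.0:
        return 0.0
    g = x
    for _ in range(iters):
        g = 0.5 * (g + x / g)
    return g
```

Multiply this by seven functions, each needing its own convergence and edge-case handling, and the 97 lines add up quickly — all to replicate operations the hardware already does in one instruction.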

std::random — 230 lines implementing the full Mersenne Twister, including CPython’s init_by_array seeding, Box-Muller gaussian transform, Fisher-Yates shuffle, and bisect-based weighted sampling. Every line matched CPython’s implementation exactly, because the whole point was byte-identical output for verification.
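The weighted-sampling piece is the easiest to show. CPython's random.choices builds a cumulative-weight table and binary-searches it; a Python equivalent of that scheme looks like this (the helper name weighted_pick is mine):

```python
import bisect
import itertools
import random

def weighted_pick(population, weights, rng):
    # Cumulative weights + bisect: the same scheme random.choices uses.
    cum = list(itertools.accumulate(weights))
    r = rng.random() * cum[-1]
    return population[bisect.bisect(cum, r)]
```

Seeded identically, this reproduces random.choices pick for pick — the property the port needed, since any divergence in the sampling stream makes the training runs impossible to compare.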

Together, these two missing modules account for 327 lines — nearly a third of the entire file.

Tier 3: Language features that would make the code better

Unary negation. The spec lists unary - in the operator precedence table. The parser doesn’t accept it. Every negative number is 0.0 - x — fourteen times across the file.

break and continue. Not in the grammar. Every “break out of a while loop” needs a flag variable. This pattern directly caused the tokenizer corruption described under the compound bugs below — the done-flag approach corrupted the loop counter and put sorted elements in the wrong positions.
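The flag pattern looks harmless but has a sharp edge: the counter still advances on the exit iteration. A Python model of the pattern (find_with_flag is an illustrative name, not from the port):

```python
def find_with_flag(xs, target):
    # Early exit without `break`: fold a done flag into the loop condition.
    i, done, found = 0, False, -1
    while i < len(xs) and not done:
        if xs[i] == target:
            found = i
            done = True
        i += 1  # still runs on the iteration that sets the flag
    return i, found

# After a hit, i is found + 1, not found. Reuse i as "the position" and
# you get exactly the kind of off-by-one that scrambled the tokenizer sort.
```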

String ordering operators. Only == and != work on strings. No <, >, <=, >= for lexicographic comparison. The tokenizer sort had to use string_char_at to extract character codes for comparison, mixing char and String types in confusing ways.
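Rebuilt from character codes, lexicographic less-than takes the following shape — a Python sketch of the comparison the sort had to hand-roll (str_less is my name for it):

```python
def str_less(a: str, b: str) -> bool:
    # Lexicographic < built from per-character codes, the shape the
    # tokenizer sort had to use via string_char_at.
    for ca, cb in zip(a, b):
        if ord(ca) != ord(cb):
            return ord(ca) < ord(cb)
    return len(a) < len(b)  # shared prefix: the shorter string sorts first
```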

The compound bugs

Some issues looked like one thing but were actually two language gaps stacking on top of each other.

The tokenizer sort corruption was the worst. Vec<String>.get() returns a reference — so when the insertion sort shifts elements, it overwrites the key’s source position. That’s language gap #1: no explicit copy semantics for string values in collections. And without break, the workaround pattern for early loop exit corrupted the insertion index. That’s language gap #2. Either gap alone would have been a minor annoyance. Together, they corrupted the entire vocabulary and produced a model that trained on garbage tokens.
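The reference half of the bug is easy to model in Python by treating the "reference" as an index into the backing array — the function names and the modelling trick are mine, not the Hew code:

```python
def insertion_sort_by_reference(a):
    # The key is a live reference into slot i (modelled here as an index),
    # mimicking Vec<String>.get() in Hew: the first shift overwrites the
    # key's source position, so later comparisons read garbage.
    for i in range(1, len(a)):
        key_ref = i
        j = i - 1
        while j >= 0 and a[j] > a[key_ref]:
            a[j + 1] = a[j]  # overwrites slot i on the first iteration
            j -= 1
        a[j + 1] = a[key_ref]

def insertion_sort_by_copy(a):
    # The fix: copy the key out of the vector before shifting.
    for i in range(1, len(a)):
        key = a[i]  # an owned copy, immune to the shifts below
        j = i - 1
        while j >= 0 and a[j] > key:
            a[j + 1] = a[j]
            j -= 1
        a[j + 1] = key
```

On `["c", "a", "b"]` the reference version collapses the whole array to the largest element, which is exactly how the vocabulary filled up with duplicate tokens.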

The struct mutation issue compounded with the missing var field syntax. The spec says type Foo { var x: i32; } should declare mutable fields, but the parser rejects var in field declarations. And even if fields were mutable, function parameters are immutable, so you still couldn’t assign to them in a callee. Two gaps, same area, both blocking the natural way to write stateful types.

The numbers

| Category | Lines | What it replaces |
| --- | --- | --- |
| std::math (hand-rolled) | ~97 | LLVM intrinsic wrappers |
| std::random (hand-rolled) | ~230 | Runtime PRNG module |
| Numeric conversion workarounds | ~30 | .to_f64() intrinsic |
| Vec<Vec<T>> flat indexing | ~20 | Nested vector types |
| Struct mutation Vec wrappers | ~15 | Mutable struct parameters |
| Unary negation 0.0 - x | ~14 | -x |
| Break-flag patterns | ~10 | break statement |
| Total workaround code | ~340 | 31% of 1,092 lines |

The ~750 lines that would remain work out to roughly a 6x expansion of the 120-line Python — in the same range as a port to Rust or Go without their standard libraries. The other ~340 lines are pure language overhead, and the Tier 1 bugs are first on the fix list. The next post walks through what happened when those fixes landed.