Friday Fixes: Housekeeping the Homelab and Hub
Some weeks you ship a big feature. Other weeks you sweep the floor so the big features keep working. This was a floor-sweeping week: three unrelated workstreams that all needed attention.
Track one: the homelab's local LLM stack hadn't been touched in a month. Models were stale, llama.cpp was 469 builds behind, and the embedding model was a generation old.
Track two: the vacation planning site I open-sourced needed to actually be useful for a group trip. Calendar sync, activity voting, expense tracking — the features that turn a brochure into a tool.
Track three: the Substack syndication pipeline I wrote about earlier this week? Turns out doing it once was the easy part. Doing it every time surfaced two more undocumented quirks and required a GitHub Action to paper over them.
None of these stories is glamorous on its own. Together they're a snapshot of what maintenance week looks like when you're building with an agent.
Part 1: Homelab Model Refresh
The homelab runs llama.cpp on an RTX 5090 with six switchable models. The agent audited everything and came back with a report card:
| Component | Before | Verdict |
|---|---|---|
| llama.cpp | b8933 | 469 builds behind |
| Qwen (daily driver) | 3.5 35B-A3B | 3.6 available |
| Embedding | nomic-embed v1.5 | v2-moe available |
| Gemma 4, Devstral, DeepSeek | Current | No action needed |
| Codestral | v0.1 (2024) | Dead end — Mistral pivoted to Devstral |
Three downloads, ~38 GB total: Qwen 3.6, nomic-embed v2-moe, and a new addition — Qwen3-Coder-30B-A3B, a coding-specialized MoE that fits at 17 GB.
The Quant Trap
The interesting discovery was about quant provenance. Our Qwen model uses UD-Q4_K_XL quantization — the "XL" quants use higher precision on attention layers while keeping MoE expert layers smaller. These are unsloth-specific. Bartowski (the other major GGUF publisher) doesn't offer them. The agent initially found the bartowski version and we had to redirect it to unsloth to get the same quant type we were already running.
This matters because quant format affects output quality in ways that aren't obvious from the model name alone. Q4_K_M and Q4_K_XL are both "4-bit" but they allocate precision differently. Swapping quant types during an upgrade is an uncontrolled variable.
Script Updates
The homelab's model switching lives in a shell script (llm-switch.sh) that maps model names to file paths and llama-server flags. Updates: Qwen path from 3.5 to 3.6, new qwen-coder case with 128K context, embedding path from v1.5 to v2-moe, Codestral marked [legacy].
Gotcha: Pasting heredoc scripts into the terminal mangled backslashes and quoting. We switched to writing the scripts in the workspace, pushing to GitHub, and giving me a git pull && cp one-liner. Lesson: don't paste shell scripts through chat — commit them.
After State
| Component | Before | After |
|---|---|---|
| llama.cpp | b8933 | b9402 |
| Generation model | Qwen 3.5 | Qwen 3.6 |
| Embedding model | nomic v1.5 (262 MB) | nomic v2-moe (914 MB) |
| Switchable models | 5 | 6 (added qwen-coder) |
| VRAM | 26,262 MiB | 26,682 MiB (+420 MiB) |
About 20 minutes wall clock from audit to fully updated, zero downtime. The old models still serve until you restart the service with the new binary.
Part 2: Vacation Hub Feature Sprint
The vacation hub is a forkable trip-planning site — deploy to Vercel, run the setup wizard, and your group has a private site for travel notes, itinerary, lodging, activities, photos. I wrote about open-sourcing it last week. This week was about making it useful.
Four features across three days, 11 commits, 3,484 lines added. But the features aren't the interesting part. The bugs are.
Calendar Sync (the straightforward one)
People need trip events in their phone's calendar. Two options: download a .ics file (one-time import) or subscribe to a URL (auto-syncing).
The download is trivial — click a button, get a file. The subscription is the interesting engineering problem. Google Calendar, Apple Calendar, and Outlook all fetch subscription URLs from their servers. No browser, no cookies. So the endpoint needs an auth mechanism that works without a session.
We went with a deterministic HMAC token: HMAC-SHA-256('calendar-subscribe', VACATION_HUB_SECRET). The export endpoint accepts either a cookie (for browser downloads) or a ?token= param (for calendar clients). No expiry — a time-limited token would silently break subscriptions when it expires and there's no user present to re-authenticate.
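A minimal sketch of that token scheme using Node's built-in crypto module. The function names are hypothetical, and it assumes the secret is the HMAC key and the fixed label is the message; the hub's real code may order or encode things differently.

```typescript
import { createHmac, timingSafeEqual } from "node:crypto";

// Assumption: VACATION_HUB_SECRET is the key, "calendar-subscribe" is the message.
function calendarToken(secret: string): string {
  return createHmac("sha256", secret).update("calendar-subscribe").digest("hex");
}

// Compare the ?token= value against the expected token in constant time.
function isValidCalendarToken(candidate: string, secret: string): boolean {
  const expected = Buffer.from(calendarToken(secret), "utf8");
  const given = Buffer.from(candidate, "utf8");
  // timingSafeEqual throws when lengths differ, so check the length first.
  return given.length === expected.length && timingSafeEqual(given, expected);
}
```

Because the token depends only on the secret, rotating the secret is the only way to revoke every subscription link at once.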
The iCal generator itself is 202 lines, built from scratch against RFC 5545. The subtle part is line folding — the spec requires max 75 octets per line, not characters. You can't just .slice(75) because you might split a UTF-8 multi-byte character. The fold function walks backward from the cut point checking continuation bytes. Most iCal libraries get this wrong and corrupt non-ASCII event names.
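Here is one way the octet-aware fold can look. It is a sketch under assumptions, not the hub's generator: `foldLine` is a hypothetical name, and it handles a single unfolded content line.

```typescript
// Fold one content line so that no physical line exceeds `limit` octets.
// Folding inserts CRLF followed by a single space, as RFC 5545 describes.
function foldLine(line: string, limit = 75): string {
  if (limit < 5) throw new RangeError("limit too small to hold a 4-byte character");
  const bytes = new TextEncoder().encode(line);
  const decoder = new TextDecoder("utf-8", { ignoreBOM: true });
  const parts: string[] = [];
  let start = 0;
  let room = limit; // the first physical line gets the full limit
  while (bytes.length - start > room) {
    let cut = start + room;
    // If the byte at the cut is a UTF-8 continuation byte (10xxxxxx),
    // the cut would split a character, so walk backward to its lead byte.
    while (cut > start && (bytes[cut] & 0xc0) === 0x80) cut--;
    parts.push(decoder.decode(bytes.subarray(start, cut)));
    start = cut;
    room = limit - 1; // continuation lines spend one octet on the leading space
  }
  parts.push(decoder.decode(bytes.subarray(start)));
  return parts.join("\r\n ");
}
```

A calendar client unfolds by removing each CRLF plus the single space that follows it, which restores the original line byte for byte.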
Activity Voting (the bug factory)
Reddit-style upvote/downvote on suggested activities. Name-based identity (localStorage, no accounts). Upsert voting so changing your mind is idempotent.
This feature worked perfectly in development and completely failed in production. Twice, for two different reasons.
Bug 1 — The Trailing Slash Massacre: next.config.ts has trailingSlash: true, which makes Next.js issue 308 redirects from /api/foo to /api/foo/. The redirect preserves the HTTP method but the browser drops the request body. Every POST, PUT, and DELETE arrived at the API with an empty body. GET requests (page loads, data fetching) worked fine, so the site looked healthy — only mutations were silently failing.
The fix: add trailing slashes to all 28 fetch() calls across 12 files. Eight minutes to fix, 40 minutes to diagnose. trailingSlash: true is a foot-gun for API routes — fine for page navigation, lethal for fetch().
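An alternative to editing every call site by hand is to route requests through one small helper. This is a sketch, not what the hub shipped (the fix above edited the 28 calls directly), and `withTrailingSlash` is a hypothetical name.

```typescript
// Ensure the path portion of a URL ends with "/" so the request never
// hits the trailingSlash redirect. Query strings and fragments are preserved.
function withTrailingSlash(url: string): string {
  const i = url.search(/[?#]/);
  const path = i === -1 ? url : url.slice(0, i);
  const rest = i === -1 ? "" : url.slice(i);
  return path.endsWith("/") ? url : `${path}/${rest}`;
}

// Usage (assumed endpoint name):
//   fetch(withTrailingSlash("/api/votes"), { method: "POST", body: JSON.stringify(vote) })
```

With a wrapper like this, a forgotten slash on a new endpoint cannot reintroduce the bug.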
Bug 2 — The Table That Never Existed: After fixing trailing slashes, voting still didn't work. The activity_votes table didn't exist on production. It existed in development because the dev database didn't have duplicate activity titles.
The initializeDatabase() function runs CREATE TABLE statements sequentially in a single try block. After creating the activity_suggestions table, it tries to create a unique index on the title column. Production had duplicate titles (imported via LLM-generated suggestions). The index creation threw, the catch block caught it, and the function exited before reaching CREATE TABLE activity_votes.
The debugging journey: deploy a temporary /api/db/debug/ endpoint → confirm the table is missing → trace the init function → find the ordering dependency → wrap the index creation in its own try/catch → re-run init → delete the debug endpoint. Two commits, two minutes apart.
The lesson: every DDL statement in an init function should be its own try/catch. A failure to create an index on table A should never prevent table B from being created.
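That lesson can be sketched as a loop that isolates each statement. The `Exec` type and `runDdlIndependently` are hypothetical; the real initializeDatabase() presumably calls its own database client instead.

```typescript
// Hypothetical signature for whatever runs SQL against the database.
type Exec = (sql: string) => Promise<void>;

interface DdlFailure {
  sql: string;
  error: unknown;
}

// Run every statement in its own try/catch so one failure
// (say, a unique index over duplicate rows) cannot skip the rest.
async function runDdlIndependently(exec: Exec, statements: string[]): Promise<DdlFailure[]> {
  const failures: DdlFailure[] = [];
  for (const sql of statements) {
    try {
      await exec(sql);
    } catch (error) {
      failures.push({ sql, error }); // record it and keep going
    }
  }
  return failures;
}
```

Returning the failures instead of swallowing them matters too: the caller can log exactly which statement broke, which would have shortened the debugging journey above.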
PDF Upload Fix (the serverless trap)
This one predated the feature sprint but came up during testing. PDF itinerary uploads worked locally, failed on Vercel with a cryptic module error.
The pdf-parse npm package bundles an ancient version of PDF.js that uses dynamic require(). Vercel's bundler traces imports statically and prunes anything it can't resolve. The module exists in node_modules locally but vanishes after bundling.
Bonus discoveries while debugging:
- The upload endpoint returned "Something went wrong" for all errors. We had to add real error logging before we could even see the pdf-parse failure.
- iOS Safari sends an empty MIME type for PDFs. The validation rejected them.
- Vercel has a 4.5MB body limit for serverless functions. Our upload validation originally allowed 10MB, so larger files were rejected by the platform before our code ever saw them.
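The three discoveries above suggest a validation step along these lines. It is a sketch with assumed names and thresholds: `checkPdfUpload` is hypothetical, the 4 MB cap is my guess at reasonable headroom under the 4.5MB limit, and falling back to the file extension for an empty MIME type is one possible policy, not necessarily the one the hub uses.

```typescript
// Assumption: stay under Vercel's 4.5MB body limit with room for multipart overhead.
const MAX_UPLOAD_BYTES = 4 * 1024 * 1024;

type UploadCheck = { ok: true } | { ok: false; reason: string };

function checkPdfUpload(file: { name: string; type: string; size: number }): UploadCheck {
  // iOS Safari can send an empty MIME type, so fall back to the extension.
  const looksLikePdf =
    file.type === "application/pdf" ||
    (file.type === "" && file.name.toLowerCase().endsWith(".pdf"));
  if (!looksLikePdf) {
    return { ok: false, reason: `Unsupported file type "${file.type}" for ${file.name}` };
  }
  if (file.size > MAX_UPLOAD_BYTES) {
    return { ok: false, reason: `File is ${file.size} bytes; the limit is ${MAX_UPLOAD_BYTES}` };
  }
  return { ok: true };
}
```

Each rejection carries a specific reason, so the endpoint has something better than "Something went wrong" to log and return.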
Replaced pdf-parse with unpdf (serverless-compatible). Three files changed, 21 insertions, 38 deletions. The kind of fix that's trivial once you know the root cause and impossible until you do.
Expense Management (the big one)
2,108 lines across 13 files. Track who paid for what, scan receipts with AI, show who owes whom.
The receipt scanning supports three LLM providers — same ones the site already uses for itinerary parsing. Each has its own quirks: OpenAI accepts image URLs directly, Anthropic and Gemini require base64 encoding. OpenAI and Gemini support structured JSON output, Anthropic requires regex extraction from prose. For PDFs, all three get extracted text rather than the visual layout.
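For the regex-extraction case, the idea is roughly the following. This is an illustrative sketch, not the hub's parser: `extractJsonObject` is a hypothetical name, and it assumes the reply contains a single JSON object somewhere in the prose.

```typescript
// Pull a JSON object out of a prose reply.
// Returns null when nothing parseable is found.
function extractJsonObject(text: string): unknown {
  // Greedy match from the first "{" to the last "}" in the reply.
  const match = text.match(/\{[\s\S]*\}/);
  if (!match) return null;
  try {
    return JSON.parse(match[0]);
  } catch {
    return null; // braces were present but the content was not valid JSON
  }
}
```

The greedy match keeps nested objects intact, but it will fail if the model emits two separate objects or stray braces in its commentary, so the caller should treat null as "ask again or fall back to manual entry".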
The design pivot that mattered: The original plan had per-expense split counts. "This $200 dinner was split 4 ways." In practice, the form was cluttered and the answer was almost always the same number. We changed to a global "Splitting between N people" control at the top of the page. The form went from three columns to two. Settlement computation moved from a server endpoint to a useMemo hook — because the split count is a UI concern (you might flip between values while looking at the numbers), not persistent data.
We built the server endpoint, shipped it, realized it was wrong, moved the logic client-side, and deleted the endpoint. Normal lifecycle.
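For illustration, client-side settlement can be a small pure function. This is a sketch, not the hub's code: the `Expense` shape, integer cents, and a strictly even split across the global count are all assumptions.

```typescript
interface Expense {
  paidBy: string;
  amountCents: number;
}

// Positive balance: the person is owed money. Negative: they owe.
// Anyone in the group who paid nothing is absent from the map and owes one full share.
function computeBalances(expenses: Expense[], splitCount: number): Map<string, number> {
  if (!Number.isInteger(splitCount) || splitCount < 1) {
    throw new RangeError("splitCount must be a positive integer");
  }
  const total = expenses.reduce((sum, e) => sum + e.amountCents, 0);
  const share = total / splitCount;
  const paid = new Map<string, number>();
  for (const e of expenses) {
    paid.set(e.paidBy, (paid.get(e.paidBy) ?? 0) + e.amountCents);
  }
  const balances = new Map<string, number>();
  paid.forEach((amount, name) => balances.set(name, amount - share));
  return balances;
}
```

In a React component this would typically be wrapped as `useMemo(() => computeBalances(expenses, splitCount), [expenses, splitCount])`, so flipping the split count recomputes instantly without a round trip.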
The Cleanup
After the feature sprint, we went back and deleted dead code:
- /api/expenses/settle/route.ts: settlement moved client-side
- /api/og-image/route.ts: only consumer was the activity POST handler, which we'd stripped during the Things to Do redesign
- The OG image fetch block in the activity POST handler itself
363 lines deleted. We also went back to the expense feature's design doc and annotated it with what actually shipped versus what was planned. There's something honest about marking your own plan with "this part we built differently." The plan is the record of what you thought before you knew better. The code is what you actually shipped.
Part 3: Automating Substack Syndication
I wrote up the initial Substack import earlier this week — 13 curated posts, an RSS feed filtered by a syndicate: true frontmatter flag, and a GitHub mirror repo to work around Substack rejecting feeds from our domain. That got the backlog in. This week's Thursday Thoughts post was the first one I needed to push after the initial import.
It didn't go smoothly.
Two more dedup quirks
Quirk 1 — per-feed-URL dedup. Substack doesn't just dedup by GUID. It dedupes by feed URL. If you add a new post to syndicate.xml and re-import the same URL, Substack silently skips the new item. The existing 13 posts aren't reimported (good), but the new 14th post isn't imported either (bad). No error. The import API returns 200 and reports it found 14 posts. It just doesn't do anything with the new one.
The workaround: a separate single-import.xml file containing only the new post, with a timestamped GUID that Substack has never seen. Different URL, different GUID, different dedup bucket.
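A sketch of generating that one-item feed. The `Post` shape, the channel metadata, and the slug-plus-timestamp GUID format are all assumptions for illustration; the only parts taken from the workflow are "one item" and "a GUID Substack has never seen".

```typescript
interface Post {
  title: string;
  link: string;
  slug: string;
  pubDate: Date;
  html: string;
}

function escapeXml(s: string): string {
  return s
    .replace(/&/g, "&amp;")
    .replace(/</g, "&lt;")
    .replace(/>/g, "&gt;")
    .replace(/"/g, "&quot;");
}

// Build an RSS 2.0 feed containing only `post`, with a timestamped GUID.
function buildSingleImportFeed(post: Post, now: Date = new Date()): string {
  const guid = `${post.slug}-${now.getTime()}`; // assumed format
  return [
    '<?xml version="1.0" encoding="UTF-8"?>',
    '<rss version="2.0">',
    "<channel>",
    "<title>Single import</title>",
    `<link>${escapeXml(post.link)}</link>`,
    "<description>One-post feed for import</description>",
    "<item>",
    `<title>${escapeXml(post.title)}</title>`,
    `<link>${escapeXml(post.link)}</link>`,
    `<guid isPermaLink="false">${escapeXml(guid)}</guid>`,
    `<pubDate>${post.pubDate.toUTCString()}</pubDate>`,
    `<description>${escapeXml(post.html)}</description>`,
    "</item>",
    "</channel>",
    "</rss>",
  ].join("\n");
}
```

Since the timestamp changes on every run, regenerating the file for the same post always yields a fresh GUID.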
Quirk 2 — Cloudflare blocks GitHub Actions. The live feed at vibescoder.dev/syndicate.xml returns 403 when fetched from GitHub Actions runners. Same IP reputation issue that made Substack reject the feed in the first place — Vercel sits behind Cloudflare, and Cloudflare's bot protection doesn't love datacenter IP ranges. curl from a laptop works fine. curl from ubuntu-latest on Actions gets a wall.
The workflow
The automation lives as a GitHub Action in the content repo (where posts are pushed). On any push to content/posts/:
- Wait 90 seconds for Vercel to rebuild
- Fetch the live syndicate.xml (with retry and user-agent headers to appease Cloudflare)
- Clone the mirror repo and diff GUIDs to find new posts
- Update syndicate.xml in the mirror, preserving existing GUID busts from prior imports
- Generate single-import.xml with a unique timestamped GUID
- Push to the mirror repo
- Post a summary in the Actions run with the Substack import URL
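The "diff GUIDs" step in the list above boils down to a set difference. Here is a rough sketch with hypothetical function names. It uses a regex rather than a real XML parser, and it assumes GUIDs in the mirror can be compared directly with the live feed; the actual workflow also has to account for GUIDs altered by earlier imports, which this ignores.

```typescript
// Collect the text of every <guid> element, unwrapping CDATA if present.
function extractGuids(feedXml: string): Set<string> {
  const guids = new Set<string>();
  const re = /<guid[^>]*>([\s\S]*?)<\/guid>/g;
  let m: RegExpExecArray | null;
  while ((m = re.exec(feedXml)) !== null) {
    const raw = m[1].trim().replace(/^<!\[CDATA\[([\s\S]*)\]\]>$/, "$1");
    guids.add(raw.trim());
  }
  return guids;
}

// GUIDs present in the live feed but not yet in the mirror, in feed order.
function findNewGuids(liveXml: string, mirrorXml: string): string[] {
  const seen = extractGuids(mirrorXml);
  return Array.from(extractGuids(liveXml)).filter((g) => !seen.has(g));
}
```

An empty result means there is nothing to syndicate, so the workflow can stop before touching the mirror.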
The last step is manual — you paste the URL into Substack's import UI. Substack's import API exists but requires session authentication, and there's no official way to get a token. Fully automated posting would need the python-substack library, which reverse-engineers the auth flow. That's a project for when I have more than one subscriber.
For now: push a post with syndicate: true, wait for the Action to run, paste one URL. Three minutes end-to-end, zero chance of forgetting to update the mirror.
By the Numbers
Homelab:
- 3 models downloaded (38 GB)
- 469 llama.cpp builds caught up (b8933 → b9402)
- 6 switchable models (was 5, added qwen-coder)
- 420 MiB VRAM increase from the embedding upgrade
- ~20 minutes wall clock from audit to fully updated
Vacation Hub:
- 11 commits over 3 days
- 35 files changed, 3,484 lines added, 702 deleted
- 4 features shipped (calendar sync, voting, page redesign, expenses)
- 3 production bugs fixed (trailing slash, missing table, pdf-parse)
- 28 fetch() calls fixed with trailing slashes in one commit
- 202 lines for a from-scratch RFC 5545 iCal generator
- 2,108 lines for expense management in a single commit
- 363 lines deleted during cleanup
- 1 npm package replaced (pdf-parse → unpdf)
- 0 user accounts — names in localStorage and a prayer
Substack Syndication:
- 2 undocumented quirks discovered (per-feed-URL dedup, Cloudflare blocking Actions)
- 1 GitHub Action to auto-sync the mirror repo on every content push
- 1 manual step remaining (paste the import URL into Substack)
- ~3 minutes end-to-end per syndicated post, down from ~15 minutes manual