Luis Villa: Three LLM-assisted projects
Some notes on my first serious coding projects in something like 20 years, possibly longer. If you’re curious what these projects mean, there are more thoughts over on the OpenML.fyi newsletter.
TLDR
[Image: a GitHub contribution graph, showing a lot of activity in the past three weeks after virtually none the rest of the year.]
News, Fixed
The “Fix The News” newsletter is a pillar of my mental health these days, bringing me news that the world is not entirely going to hell in a handbasket. And my 9yo has repeatedly noted that our family news diet is “broken” in exactly the way Fix The News is supposed to fix: hugely negative, hugely US-centric. So I asked Claude to create a “newspaper” version of FTN, a two-page PDF of some highlights. It was a hit.
So I’ve now been working with Claude Code to create and gradually improve a four-days-a-week “News, Fixed” newspaper. This has been super-fun for the whole family: my wife has made various suggestions over my shoulder, my son devours it every morning, and it’s the first serious coding project I’ve tackled in ages. It is almost entirely personal (it still has hard-coded Duke Basketball schedules) but is nevertheless public and FOSS. (It is even my first use of reuse.software, and also of SonarQube Server!)
Example newspaper here.
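(For flavor, here is a minimal sketch of what the core loop of something like this can look like. This is not the project’s actual code; the feed URL, the response shape, the eight-item cutoff, and the use of weasyprint for PDF rendering are all assumptions for illustration.)

```python
# Hypothetical sketch, not the project's actual code: the feed URL, the
# response shape, and the eight-item cutoff are all illustrative.
# Assumes `pip install requests weasyprint`.
import requests
from weasyprint import HTML

FEED_URL = "https://example.com/fix-the-news/latest.json"  # placeholder

def fetch_highlights() -> list[dict]:
    """Fetch newsletter items; the JSON shape here is an assumption."""
    resp = requests.get(FEED_URL, timeout=30)
    resp.raise_for_status()
    return resp.json()["items"][:8]  # roughly two pages' worth

def render_newspaper(items: list[dict], out_path: str = "news-fixed.pdf") -> None:
    """Lay the items out in two newspaper-ish columns and write a PDF."""
    articles = "\n".join(
        f"<article><h2>{item['title']}</h2><p>{item['summary']}</p></article>"
        for item in items
    )
    html = f"""<html><head><style>
      body {{ columns: 2; font-family: Georgia, serif; }}
      h1 {{ column-span: all; text-align: center; }}
    </style></head>
    <body><h1>News, Fixed</h1>{articles}</body></html>"""
    HTML(string=html).write_pdf(out_path)

if __name__ == "__main__":
    render_newspaper(fetch_highlights())
```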
No matter how far removed you are from practical coding experience, I cannot recommend enough finding a simple, fun project like this that scratches a human itch in your life, and using the project to experiment with the new code tools.
Getting Things Done assistant
While working on News, Fixed, a friend pointed out Steve Yegge’s “beads”, which reimagines software issue tracking as an LLM-centric activity: JSON-centric, tracked in git, etc. At around the same time, I was also pointed at Superpowers, essentially canned “skills” like “teach the LLM, temporarily, how to brainstorm”.
The two of these together in my mind screamed “do this for your overwhelmed todo list”. I’ve long practiced various bastardized versions of Getting Things Done, but one of the hangups has been that I’m inconsistent about doing the daily/weekly/nth-ly reviews that good GTD really relies on. I might skip a step, or not look through all my huge “someday-maybe” list, or… any of many reasons one can be tired and human when faced with a wall of text. Also, while there are many tools out there to do GTD, in my experience they either make some of the hardest parts (like the reviews) your problem, or they don’t quite fit with how I want to do GTD, or both. Hacking on my own prompts to manage the reviews seems to fit these needs to a T.
I currently use Amazing Marvin as my main GTD tool. It is funky and weird and I’ve stuck with it much longer than any other task tracker I’ve ever used. So what I’ve done so far:
- wrapped the Marvin API to extract JSON
- discovered the Marvin API is very flaky, so I’ve done some caching and validation (a sketch follows this list)
- written a lot of prompts for the various phases/tasks in GTD. These work to varying degrees and I really want to figure out how to collaborate with others on them, because I suspect that as more tools offer LLM-ish APIs (whoa, Todoist!) these prompts are where the real fun and action will be.
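For concreteness, here is a rough sketch of that wrapper-plus-caching layer, along with the flavor of one review prompt. Caveat: the base URL, endpoint names, auth header, and response fields are my assumptions about the Marvin API, not verified excerpts from it.

```python
# Hypothetical sketch of the read-only wrapper. The base URL, endpoint
# names, auth header, and response fields are my assumptions about the
# Marvin API; check its docs before using any of this.
import json
import time
from pathlib import Path

import requests

BASE = "https://serv.amazingmarvin.com/api"  # assumed base URL
CACHE = Path(".marvin-cache")
CACHE.mkdir(exist_ok=True)

def get(endpoint: str, token: str, max_age: int = 3600) -> list[dict]:
    """Fetch an endpoint, caching to disk because the API is flaky."""
    cache_file = CACHE / (endpoint.strip("/").replace("/", "_") + ".json")
    if cache_file.exists() and time.time() - cache_file.stat().st_mtime < max_age:
        return json.loads(cache_file.read_text())
    resp = requests.get(f"{BASE}/{endpoint}", headers={"X-API-Token": token}, timeout=30)
    resp.raise_for_status()
    items = resp.json()
    # Light validation: drop records missing the fields the prompts rely on.
    items = [item for item in items if "title" in item and "_id" in item]
    cache_file.write_text(json.dumps(items))
    return items

# The flavor of one GTD prompt; the cached task JSON gets appended to it.
WEEKLY_REVIEW_PROMPT = """You are helping with a GTD weekly review.
For each project below, flag (1) projects with no next action,
(2) next actions too vague to start, and (3) someday/maybe items
worth promoting. Be terse; output a checklist I can work through."""
```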
This is all read-only right now because of limitations in the Marvin API, but for various reasons I’m not yet ready to embark on building my own entire UI. So this will do for now. That means the code is very specific to me. The prompts, on the other hand…
Note that my emphasis is not on “do tasks”, it is on helping me stay on priority. Less “chief of staff”, more “executive assistant”—both incredibly valuable when done well, but different roles. This is different from some of the use examples for Yegge’s Beads, which really are around agents.
Also note: the results have been outstanding. I’m getting more easily into my doing zone, I think largely because I have less anxiety about staring at the Giant Wall of Tasks that defines the life of any high-level IC. And my projects are better organized and todos feel more accurate than they have been in a long time, possibly ever.
A note on LLMs and issue/TODO tracking
It is worth noting that while LLMs are probabilistic/lossy and so can’t find the “perfect” next TODO to work on, that’s OK. Personal TODO and software issue tracking are inherently subjective, probabilistic activities: there is no objectively perfect “next best thing to work on”, “most important thing to work on”, etc. So the fact that an LLM is only probabilistic in identifying the next task to work on is fine; no human can do substantially better. In fact I’m pretty sure that once an issue list is past a certain size, the LLM is likely to be able to do better, if (and like many things LLM, this is a big if) you can provide it with documented standards explaining how you want to do prioritization. (Literally one of the first things I did at my first job was write standards on how to prioritize bugs, the forerunner of this doc, so I have strong opinions, and experience, here.)
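To make “documented standards” concrete, here is a hedged sketch of what feeding a written prioritization standard to a model can look like, using the Anthropic Python SDK; the file paths and model string are placeholders, not anything from my actual setup.

```python
# Sketch: ask an LLM to prioritize against a written standard, not vibes.
# Assumes `pip install anthropic` and ANTHROPIC_API_KEY in the environment;
# the file paths and model string are placeholders.
import json
from pathlib import Path

import anthropic

standards = Path("prioritization-standards.md").read_text()  # your written rules
tasks = json.loads(Path("tasks.json").read_text())  # e.g. from the wrapper above

prompt = (
    f"Prioritization standards:\n{standards}\n\n"
    f"Open tasks (JSON):\n{json.dumps(tasks, indent=2)}\n\n"
    "Rank the top five tasks to work on next. For each, cite which "
    "standard justifies its rank, so the reasoning can be audited."
)

client = anthropic.Anthropic()
reply = client.messages.create(
    model="claude-sonnet-4-20250514",  # placeholder; use a current model
    max_tokens=1024,
    messages=[{"role": "user", "content": prompt}],
)
print(reply.content[0].text)
```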
Skills for license “concluded”
While at a recent Linux Foundation event, I was shocked to realize how many very smart people haven’t internalized the skills/prompts/context stuff. It’s either “you chat with it” or “you train a model”. This is not their fault; it is hard to keep up!
Of course this came up most keenly in the context of the age-old problem of “how do I tell what license an open source project is under”. In other words, what is the difference between “I have scanned this” and “I have reached the zen state of SPDX’s ‘concluded’ field”.
So… yes, I’ve started playing with scripts and prompts on this. It’s much less far along than the other two projects above, but I think it could be very fruitful if structured correctly. Some potentially big benefits above and beyond the traditional scanning and/or throw-a-lawyer-at-it approaches:
- reporting: my very strong intuition, admittedly not yet tested, is that plain-English reports on the factors below, plus links into repos, will be much easier for lawyers to use as a starting point than the UIs of traditional license-scanner tools. And I suspect ultimately more powerful as well, since they’ll be able to draw on some of the things below.
- context sensitivity: unlike a regexp, an LLM can likely understand fairly reliably, from context, some of the big failure modes of traditional pattern matching, like “this code mentions license X but doesn’t actually include it” (see the sketch after this list).
- issue analysis and change analysis: unlike traditional approaches, LLMs can look at the change history of key files like README and LICENSE and draw useful context from them. “oh hey README mentioned a license change on Nov. 9, 2025, here’s what the change was and let’s see if there are any corresponding issues and commit logs that explain this change” is something that an LLM really can do. (Also it can do that with much more patience than any human.)
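As a sketch of the change-analysis idea (and the context-sensitivity point above): pull the patch history of the key files with plain git, then build a “concluded”-style prompt for a model to reason over. The function names and prompt text here are mine, purely illustrative; no real tool is implied.

```python
# Illustrative sketch only, not a shipping tool: gather the change history
# of the key files with plain git, then build a "concluded"-style prompt
# for a model to reason over. Function names and prompt text are mine.
import subprocess

def file_history(repo: str, path: str, max_chars: int = 20_000) -> str:
    """Full patch history of one file, truncated to fit a context window."""
    log = subprocess.run(
        ["git", "-C", repo, "log", "--follow", "--patch", "--date=short", "--", path],
        capture_output=True, text=True, check=True,
    ).stdout
    return log[:max_chars]

def build_prompt(repo: str) -> str:
    """Assemble the evidence and the ask."""
    evidence = "\n\n".join(
        f"=== history of {path} ===\n{file_history(repo, path)}"
        for path in ("LICENSE", "README.md")
    )
    return (
        "You are helping a lawyer reach an SPDX-style 'concluded' license "
        "determination. Based on the evidence below: (1) state the apparent "
        "license and any changes over time, (2) flag licenses that are "
        "mentioned but not actually included, and (3) list the commits a "
        "human should read before concluding anything.\n\n" + evidence
    )
```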
ClearlyDefined offers test data on this, by the way; I’m really looking forward to seeing whether this can be made actually reliable or not. (And then we can hook up reuse.software on the backend to actually improve the upstream metadata…)
But even then, I may not ever release this. There are a lot of real risks here, and I still haven’t thought them through enough to be comfortable with them. That’s true even though I think the industry has persistently overstated its ability to reach useful conclusions about licensing, since it so persistently insists on doing licensing analysis without ever talking to maintainers.
More to come?
I’m sure there will be more of these. That said, one of the interesting temptations here is that it is very hard to say something is “done”, because it is so easy to add more. (E.g., once my personal homebrew News, Fixed is done… why not turn it into a webapp? Once my GTD scripts are done… why not port the backend? And so on.) So we’ll see how that goes.