LLM code generation—a semi-structured process
- Published: 2024-12-09 15:24
- Updated: 2024-12-09 15:25
For context: Manuel recently facilitated a workshop, “10x Development: LLMs for the Working Programmer”, which I wasn’t aware of yet. The YouTube session above is basically a recap of our chat, as are the notes I took below.
🤔 How, more specifically, did you learn to instruct LLMs?
- “I experimented with the language I use A LOT”
- “Basically (ab)used ChatGPT for 10 hours since its release, every day”
- Trial and error: “lots of paraphrasing and regenerating”
- 📝 Instructing precisely is an emerging DIY skill
- → The #1 point of failure: the lack of this bespoke skill
🤔 What’s your process like today?
- “Preamble: LLMs are good at translating one kind of language to another—they don’t think or reason.”
- “The most straightforward way of learning is regenerating as often as you can”
- “Refine—paraphrase: use different words, sentences, throw words away”
- “Regenerate”
- “Also: which documents are important for humans?”
- “A valuable document for a human is also a valuable document for the LLM”
- “EG: design specs, tutorials, documentation, tickets, meeting notes…”
- “Instead of directly jumping to generating code, which usually burns through tokens:”
- “Most of the time, I start with inventing a ‘new programming language’ using YAML DSL specs”
- “So that I have something specific to target, validate and generate code from”
- “Then I let the LLM write its own code generator, validator and interpreter” (sketches of what this can look like follow in the next two sections)
🤔 How do you define a new programming language?
“For example:” (using claude.ai)
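📝 The actual spec was created live in claude.ai and I didn’t capture it. As a stand-in, here is a minimal sketch of what such a YAML DSL spec could look like: a hypothetical “checklist” mini-language of my own, not Manuel’s output (it assumes PyYAML for parsing).

```python
# Hypothetical sketch: a tiny "checklist" mini-language described as a YAML DSL spec.
# This is my own stand-in, not the spec Manuel generated in the demo.
import yaml  # assumes PyYAML is installed (pip install pyyaml)

CHECKLIST_DSL_SPEC = """
dsl: checklist
version: 1
types:
  item:
    fields:
      title:    {type: string, required: true}
      done:     {type: bool, default: false}
      priority: {type: enum, values: [low, medium, high], default: medium}
document:
  root: {list_of: item}
"""

spec = yaml.safe_load(CHECKLIST_DSL_SPEC)
print(spec["types"]["item"]["fields"]["priority"]["values"])  # ['low', 'medium', 'high']
```

The point is less the exact schema and more that the spec is specific enough to validate against and to generate code from.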
🤔 What does an LLM require to understand a new language?
- “A consistent spec. Since the spec itself is generated by the LLM, it usually is consistent.”
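📝 For illustration, the validator you then let the LLM write against that spec might come back roughly in this shape (again my own sketch, not generated output; the rules are transcribed by hand from the hypothetical spec above).

```python
# Sketch of the kind of validator an LLM could generate from the checklist spec above.
# The rules are hard-coded here for brevity; my illustration, not actual LLM output.
import yaml

FIELDS = {
    "title":    {"type": str,  "required": True},
    "done":     {"type": bool, "required": False},
    "priority": {"type": str,  "required": False, "values": ["low", "medium", "high"]},
}

def validate_item(item: dict) -> list[str]:
    """Return human-readable errors for one checklist item."""
    errors = []
    for name, rule in FIELDS.items():
        if name not in item:
            if rule["required"]:
                errors.append(f"missing required field '{name}'")
            continue
        value = item[name]
        if not isinstance(value, rule["type"]):
            errors.append(f"field '{name}' should be a {rule['type'].__name__}")
        elif "values" in rule and value not in rule["values"]:
            errors.append(f"field '{name}' must be one of {rule['values']}")
    errors += [f"unknown field '{name}'" for name in item if name not in FIELDS]
    return errors

doc = yaml.safe_load("""
- {title: Write the DSL spec, done: true}
- {title: Generate a validator, priority: urgent}
""")
for i, entry in enumerate(doc):
    for err in validate_item(entry):
        print(f"item {i}: {err}")  # item 1: field 'priority' must be one of [...]
```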
He then provided a specific example using my now-page
- Create the YAML DSL:
- Preview the entry data:
- Generate a dummy-content generator, with preview:
- Add colors and emojis, generate a renderer:
- Rendered output (a sketch of such a renderer follows below):
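📝 I didn’t save the generated code from these steps, but the renderer could land somewhere in this neighborhood: a hypothetical sketch with made-up categories, rendering YAML entries with an emoji and an ANSI color per category for a quick terminal preview.

```python
# Hypothetical sketch of the "renderer" step: now-page entries stored as YAML,
# rendered with an emoji and an ANSI color per category in the terminal.
# Field names and categories are my own stand-ins, not the actual DSL.
import yaml

ENTRIES_YAML = """
- {category: building, text: "A YAML DSL for my /now page"}
- {category: reading,  text: "Working through a compilers book"}
- {category: music,    text: "Lots of ambient playlists"}
"""

STYLE = {                            # emoji + ANSI color code per category
    "building": ("🔧", "\033[33m"),  # yellow
    "reading":  ("📚", "\033[36m"),  # cyan
    "music":    ("🎧", "\033[35m"),  # magenta
}
RESET = "\033[0m"

def render(entries: list[dict]) -> str:
    lines = []
    for entry in entries:
        emoji, color = STYLE.get(entry["category"], ("📝", ""))
        lines.append(f"{emoji} {color}{entry['text']}{RESET}")
    return "\n".join(lines)

print(render(yaml.safe_load(ENTRIES_YAML)))
```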
💡 “My approach is NOT ‘goal → code’, …”
- “… it’s: ‘Goal → data structure → code for data structure’ ”
- “A change in code first requires a change in data structure” (see the sketch after this list)
- 📝 He then mentioned opening a new chat to demo something else
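📝 A tiny sketch of that “goal → data structure → code for the data structure” flow, with field names of my own invention: the structure is defined first, and a change (say, adding tags) happens in the structure before any code is touched.

```python
# Sketch of "goal -> data structure -> code for the data structure" (my illustration).
from dataclasses import dataclass, field

@dataclass
class NowEntry:                       # 1. the data structure comes first
    category: str
    text: str
    tags: list[str] = field(default_factory=list)  # a change starts here, not in the code

def summarize(entries: list[NowEntry]) -> str:
    """2. the code is a thin layer derived from the structure's fields."""
    return "\n".join(
        f"[{e.category}] {e.text}" + (f"  #{' #'.join(e.tags)}" if e.tags else "")
        for e in entries
    )

print(summarize([
    NowEntry("building", "A YAML DSL for my /now page", tags=["llm", "dsl"]),
    NowEntry("reading", "A compilers book"),
]))
```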
🤔 Wait, how do you know you need a new chat?
- “Whenever I change the structure or approach a new task”
- 💡 “My chats are barely longer than 2-3 instructions”
- “Whenever things don’t immediately work, I know that I did not understand the task well enough to describe it”
- “The longer the context, the slower the chat, the more irritation can happen”
- 📝 Every input is priming the LLM’s direction
- “Autoregression: The model must generate output that’s useful for the model to generate more useful output”
- “Output is not useful? New chat, revisit your instructions/data”
- “If an output meets the goal, let it generate a spec/tutorial/changelog/TIL you can use for the next chat”
- He referred to these .md files as an example of this
💡 “One more thing to internalize”
- “Your instructions create the LLM’s reality”
- “If you tell the LLM to be the APP you want to build, you create a ‘living’ framework”
- “So you work towards the point of telling it to build itself”
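📝 As I understand that last point, it boils down to a system prompt in this spirit; the wording and the send() helper below are my own hypothetical placeholders, not Manuel’s prompt.

```python
# Hypothetical sketch of "tell the LLM to *be* the app you want to build".
# The prompt text and send() are placeholders, not an actual API or Manuel's wording.
SYSTEM_PROMPT = """\
You ARE the /now-page application.
Your state is a YAML document of entries, following the DSL spec we defined earlier.
For every command I send (add / edit / remove / render), reply with:
1. the updated YAML state
2. the rendered page
When I ask you to extend yourself, first update the YAML DSL spec,
then regenerate your own validator and renderer to match it.
"""

def send(system: str, user: str) -> str:
    """Placeholder for whichever chat interface or SDK you use."""
    raise NotImplementedError

# Usage idea: the "app" builds itself one instruction at a time, e.g.
# send(SYSTEM_PROMPT, "add: {category: music, text: 'Ambient playlists'}")
```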
How I met Manuel
- He joined my Discord community in 2021
- Followed him on Mastodon
- Where he’s sharing a lot about Large Language Models
- Blogging @the.scapegoat.dev & @dev.to/wesen