LLM code generation—a semi-structured process
- Published: 2024-12-09 15:24
- Updated: 2024-12-09 15:25
For context: Manuel recently facilitated a workshop, “10x Development: LLMs for the Working Programmer”, which I wasn’t aware of yet. The YouTube session above is basically a recap of our chat, as are the notes I took below.
🤔 How, more specifically, did you learn to instruct LLMs?
- “I experimented with the language I use A LOT”
- “Basically (ab)used ChatGPT for 10 hours since its release, every day”
- Trial and error: “lots of paraphrasing and regenerating”
- 📝 Instructing precisely is an emerging DIY skill
- → The #1 point of failure: the lack of this bespoke skill
🤔 What’s your process like today?
- “Preamble: LLMs are good at translating one kind of language to another—they don’t think or reason.”
- “The most straightforward way of learning is regenerating as often as you can”
- “Refine—paraphrase: use different words, sentences, throw words away”
- “Regenerate”
- “Also: which documents are important for humans?”
- “A valuable document for a human is also a valuable document for the LLM”
- “EG: design specs, tutorials, documentation, tickets, meeting notes…”
- “Instead of directly jumping to generating code, which usually burns through tokens:”
- “Most of the time, I start with inventing a ‘new programming language’ using YAML DSL specs”
- “So that I have something specific to target, validate and generate code from”
- “Then I let the LLM write its own code generator, validator and interpreter” (sketches of what this can look like follow in the next two sections)
🤔 How do you define a new programming language?
“For example:” (using claude.ai)
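📝 The actual spec was created live in claude.ai and I didn’t capture it. As a stand-in, here is a minimal sketch of what such a YAML DSL spec could look like: a hypothetical “checklist” mini-language of my own, not Manuel’s output (it assumes PyYAML for parsing).

```python
# Hypothetical sketch: a tiny "checklist" mini-language described as a YAML DSL spec.
# This is my own stand-in, not the spec Manuel generated in the demo.
import yaml  # assumes PyYAML is installed (pip install pyyaml)

CHECKLIST_DSL_SPEC = """
dsl: checklist
version: 1
types:
  item:
    fields:
      title:    {type: string, required: true}
      done:     {type: bool, default: false}
      priority: {type: enum, values: [low, medium, high], default: medium}
document:
  root: {list_of: item}
"""

spec = yaml.safe_load(CHECKLIST_DSL_SPEC)
print(spec["types"]["item"]["fields"]["priority"]["values"])  # ['low', 'medium', 'high']
```

The point is less the exact schema and more that the spec is specific enough to validate against and to generate code from.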
🤔 What does an LLM require to understand a new language?
- “A consistent spec. Since the spec itself is generated by the LLM, it usually is consistent.”
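📝 For illustration, the validator you then let the LLM write against that spec might come back roughly in this shape (again my own sketch, not generated output; the rules are transcribed by hand from the hypothetical spec above).

```python
# Sketch of the kind of validator an LLM could generate from the checklist spec above.
# The rules are hard-coded here for brevity; my illustration, not actual LLM output.
import yaml

FIELDS = {
    "title":    {"type": str,  "required": True},
    "done":     {"type": bool, "required": False},
    "priority": {"type": str,  "required": False, "values": ["low", "medium", "high"]},
}

def validate_item(item: dict) -> list[str]:
    """Return human-readable errors for one checklist item."""
    errors = []
    for name, rule in FIELDS.items():
        if name not in item:
            if rule["required"]:
                errors.append(f"missing required field '{name}'")
            continue
        value = item[name]
        if not isinstance(value, rule["type"]):
            errors.append(f"field '{name}' should be a {rule['type'].__name__}")
        elif "values" in rule and value not in rule["values"]:
            errors.append(f"field '{name}' must be one of {rule['values']}")
    errors += [f"unknown field '{name}'" for name in item if name not in FIELDS]
    return errors

doc = yaml.safe_load("""
- {title: Write the DSL spec, done: true}
- {title: Generate a validator, priority: urgent}
""")
for i, entry in enumerate(doc):
    for err in validate_item(entry):
        print(f"item {i}: {err}")  # item 1: field 'priority' must be one of [...]
```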
He then provided a specific example using my now-page
- Create the YAML DSL:
- Preview the entry data:
- Generate a dummy-content generator, with preview:
- Add colors and emojis, generate a renderer:
- Rendered output (a sketch of such a renderer follows below):
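📝 I didn’t save the generated code from these steps, but the renderer could land somewhere in this neighborhood: a hypothetical sketch with made-up categories, rendering YAML entries with an emoji and an ANSI color per category for a quick terminal preview.

```python
# Hypothetical sketch of the "renderer" step: now-page entries stored as YAML,
# rendered with an emoji and an ANSI color per category in the terminal.
# Field names and categories are my own stand-ins, not the actual DSL.
import yaml

ENTRIES_YAML = """
- {category: building, text: "A YAML DSL for my /now page"}
- {category: reading,  text: "Working through a compilers book"}
- {category: music,    text: "Lots of ambient playlists"}
"""

STYLE = {                            # emoji + ANSI color code per category
    "building": ("🔧", "\033[33m"),  # yellow
    "reading":  ("📚", "\033[36m"),  # cyan
    "music":    ("🎧", "\033[35m"),  # magenta
}
RESET = "\033[0m"

def render(entries: list[dict]) -> str:
    lines = []
    for entry in entries:
        emoji, color = STYLE.get(entry["category"], ("📝", ""))
        lines.append(f"{emoji} {color}{entry['text']}{RESET}")
    return "\n".join(lines)

print(render(yaml.safe_load(ENTRIES_YAML)))
```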
💡 “My approach is NOT ‘goal → code’, …”
- “… it’s: ‘Goal → data structure → code for data structure’ ”
- “A change in code first requires a change in data structure” (see the sketch after this list)
- 📝 He then mentioned opening a new chat to demo something else
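📝 A tiny sketch of that “goal → data structure → code for the data structure” flow, with field names of my own invention: the structure is defined first, and a change (say, adding tags) happens in the structure before any code is touched.

```python
# Sketch of "goal -> data structure -> code for the data structure" (my illustration).
from dataclasses import dataclass, field

@dataclass
class NowEntry:                       # 1. the data structure comes first
    category: str
    text: str
    tags: list[str] = field(default_factory=list)  # a change starts here, not in the code

def summarize(entries: list[NowEntry]) -> str:
    """2. the code is a thin layer derived from the structure's fields."""
    return "\n".join(
        f"[{e.category}] {e.text}" + (f"  #{' #'.join(e.tags)}" if e.tags else "")
        for e in entries
    )

print(summarize([
    NowEntry("building", "A YAML DSL for my /now page", tags=["llm", "dsl"]),
    NowEntry("reading", "A compilers book"),
]))
```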
🤔 Wait, how do you know you need a new chat?
- “Whenever I change the structure or approach a new task”
- 💡 “My chats are barely longer than 2-3 instructions”
- “Whenever things don’t immediately work, I know that I did not understand the task well enough to describe it”
- “The longer the context, the slower the chat, the more irritation can happen”
- 📝 Every input is priming the LLM’s direction
- “Autoregression: The model must generate output that’s useful for the model to generate more useful output”
- “Output is not useful? New chat, revisit your instructions/data”
- “If an output meets the goal, let it generate a spec/tutorial/changelog/TIL you can use for the next chat”
- He referred to these .md files as an example of this
💡 “One more thing to internalize”
- “Your instructions create the LLM’s reality”
- “If you tell the LLM to be the APP you want to build, you create a ‘living’ framework”
- “So you work towards the point of telling it to build itself”
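📝 As I understand that last point, it boils down to a system prompt in this spirit; the wording and the send() helper below are my own hypothetical placeholders, not Manuel’s prompt.

```python
# Hypothetical sketch of "tell the LLM to *be* the app you want to build".
# The prompt text and send() are placeholders, not an actual API or Manuel's wording.
SYSTEM_PROMPT = """\
You ARE the /now-page application.
Your state is a YAML document of entries, following the DSL spec we defined earlier.
For every command I send (add / edit / remove / render), reply with:
1. the updated YAML state
2. the rendered page
When I ask you to extend yourself, first update the YAML DSL spec,
then regenerate your own validator and renderer to match it.
"""

def send(system: str, user: str) -> str:
    """Placeholder for whichever chat interface or SDK you use."""
    raise NotImplementedError

# Usage idea: the "app" builds itself one instruction at a time, e.g.
# send(SYSTEM_PROMPT, "add: {category: music, text: 'Ambient playlists'}")
```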
How I met Manuel
- He joined my Discord community in 2021
- Followed him on Mastodon
- Where he’s sharing a lot about Large Language Models
- Blogging @the.scapegoat.dev & @dev.to/wesen