I recently started building my own LLM Wiki. My document structure is different enough from existing examples that I had to write my own skills from scratch. (I also just like tinkering.) Along the way I hit a few pitfalls worth documenting. These may not stay relevant for long — LLM capabilities move fast, and what requires a workaround today might just work in six months.
LLMs Make Mistakes Copying Semantic-Free Text
I track which blog posts have changed and need their summaries regenerated by computing a SHA256 hash of each post in JavaScript, writing it into the summary document’s front matter, and regenerating only when the stored hash no longer matches the post. The skill passes the computed hash to the LLM and asks it to copy the hash into the front matter. Every so often, the LLM would get one character of the hash wrong.
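For readers who want to picture the change-detection step, here is a minimal sketch in Node.js. It is not the actual skill code: the `source_hash` field name and the helper names are assumptions made for illustration, and the front matter is assumed to sit between two `---` lines at the top of the summary.

```javascript
const crypto = require('node:crypto');

// Hex-encoded SHA-256 digest of a post's text.
function hashPost(text) {
  return crypto.createHash('sha256').update(text, 'utf8').digest('hex');
}

// Read the stored hash from the summary's front matter.
// `source_hash` is a hypothetical field name, not taken from the post.
function readStoredHash(summary) {
  const frontMatter = summary.match(/^---\n([\s\S]*?)\n---/);
  if (!frontMatter) return null;
  const field = frontMatter[1].match(/^source_hash:\s*([0-9a-f]{64})\s*$/m);
  return field ? field[1] : null;
}

// A summary is stale when its stored hash is missing or differs from the post's hash.
function needsRegeneration(postText, summary) {
  return readStoredHash(summary) !== hashPost(postText);
}
```

One consequence of this design: a hash copied with a single wrong character still looks like a valid hash, so the only visible symptom is that the summary is treated as stale on every run.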
I asked Claude Code why this happens. It explained that LLMs are good at predicting the next token based on context, but a SHA256 hash has no semantic content — it’s effectively random characters. With no meaning to latch onto, the LLM occasionally produces a wrong character. Claude Code updated the skill to strongly emphasize that the hash must be copied exactly. That reduced the errors. Eventually I’ll rework the flow to have JavaScript generate the front matter with the hash already in place and have the LLM fill in only the content — so there’s nothing to copy wrong.
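The planned rework could look roughly like the sketch below, where the script owns the front matter and the LLM's output is treated as body text only. The function name and the `source_hash` field are hypothetical; the real front matter likely carries more fields.

```javascript
// Assemble the summary file in code so the model never has to copy the hash.
// `source_hash` is a hypothetical field name used for illustration.
function buildSummaryDocument({ sourceHash, body }) {
  // Reject anything that is not a full lowercase hex SHA-256 digest.
  if (!/^[0-9a-f]{64}$/.test(sourceHash)) {
    throw new Error('sourceHash must be a 64-character lowercase hex string');
  }
  const frontMatter = ['---', `source_hash: ${sourceHash}`, '---'].join('\n');
  // The LLM-written body is appended untouched apart from trimming whitespace.
  return `${frontMatter}\n\n${body.trim()}\n`;
}
```

With this shape, the prompt no longer needs to mention the hash at all, which removes the copying step instead of trying to make it more reliable.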
LLMs Have No Sense of Time Passing
My skill writes the current UTC timestamp into the document front matter. I ran it once, waited an hour, ran it again, and found it had written the same timestamp as before.
An LLM running in the same conversation remembers the current time from when it first looked it up and doesn’t check again. Once it has a timestamp, it treats that as the current time indefinitely — like a stopped clock. To fix this, I had the skill explicitly call a script to get the timestamp instead of asking the LLM to produce it. Since I’m already using JavaScript in the skill, I just have the LLM run:
node -e 'console.log(new Date().toISOString())'
Self-Contradiction and Over-Cleverness
I wrote an interactive skill that finds candidate concept pairs in my wiki that could be merged (synonyms, related concepts, or parent-child relationships) and asks me whether to merge them and in which direction. The final call is mine.
The skill presents a four-option menu for each pair:
- Merge A into B
- Merge B into A
- Dismiss (don’t suggest merging A and B again)
- Skip (suggest again next time)
I also asked the LLM to recommend a merge direction when it had a clear preference, placing that option first with a “(Recommended)” label. So when it prefers merging B into A, the expected output looks like:
- Merge B into A (Recommended)
- Merge A into B
- Dismiss
- Skip
What I actually got was:
- Dismiss (Recommended)
- Merge B into A
- Merge A into B
- Skip
That’s self-contradictory. The skill instructs the LLM to surface only pairs worth merging. If it recommends not merging, it’s undermining its own earlier judgment.
Even funnier: sometimes the LLM would bundle two pairs together and present options like:
- Skip both (neither A+B nor C+D)
- Review each pair individually
I went back and had Claude Code tighten the constraints in the skill prompt. The behaviors went away. I still don’t understand exactly why more constraints help, or at what point adding more constraints starts degrading the quality of the skill’s output.
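I fixed this in the prompt, but the same rules could also be checked in code before the menu is shown. The sketch below is an assumption about how such a check might look, not something the skill currently does. The option strings are simplified placeholders (real menus would use the actual concept names), and `validateMenu` is a hypothetical helper.

```javascript
// Simplified placeholder option texts; real menus would use concept names.
const EXPECTED = ['Merge A into B', 'Merge B into A', 'Dismiss', 'Skip'];
const LABEL = ' (Recommended)';

// Returns a list of problems; an empty list means the menu is acceptable.
function validateMenu(options) {
  const problems = [];
  // Strip the label so options can be compared against the expected set.
  const plain = options.map((o) => (o.endsWith(LABEL) ? o.slice(0, -LABEL.length) : o));

  // Exactly the four known options, in any order: no bundled pairs, no extras.
  if (plain.length !== EXPECTED.length || !EXPECTED.every((e) => plain.includes(e))) {
    problems.push('menu must contain exactly the four expected options');
  }

  const recommended = options.filter((o) => o.endsWith(LABEL));
  if (recommended.length > 1) {
    problems.push('at most one option may be recommended');
  }
  if (recommended.length === 1) {
    // Recommending Dismiss or Skip contradicts having surfaced the pair at all.
    if (!recommended[0].startsWith('Merge ')) {
      problems.push('only a merge direction may be recommended');
    }
    if (options[0] !== recommended[0]) {
      problems.push('the recommended option must come first');
    }
  }
  return problems;
}
```

A non-empty result could be fed back to the LLM as a retry message. Both failures described above would be caught: the “Dismiss (Recommended)” menu fails the merge-only rule, and the bundled two-pair menu fails the four-option rule.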
Takeaways
All of these happened on Sonnet 4.6. Whether the same issues occur on Opus 4.7, I don’t know. That’s the frustrating thing about LLM pitfalls — you can never be sure if a problem is model-specific or version-specific. A fix that needs to live at the harness level today might be unnecessary six months from now. Whether this post ages well is genuinely unclear.