ZIO
ZIO Consulting IT architect and software architecture coach.

More Resources about AI You Might Want to Review


I have come across additional reading resources since I posted “Ten Resources about AI in Software Engineering You Do Not Want to Miss”. You might find them useful too.

Directions and Calls to Action

“Quick but worthwhile links” on Martin Fowler’s website references the following blog posts:

“On Memory” then points at these two posts:

Martin Fowler’s “Some thoughts on LLMs and Software Development” came out on Aug 28 too (shortly after I had published this post).

Experiments and Evaluations

Experiment results and advice on how to organize and report empirical AI/LLM evaluations can be found in:

  • Vaughn Vernon reports positive experience with Claude Code on LinkedIn; the comments contain an interesting discussion and further details.
  • A learning use case is reported in “How I use LLMs to learn new subjects”; other AI posts by Sean Goedecke are informative too.
  • “How far can we push AI autonomy in code generation?”, by Birgitta Böckeler:
    • The test case is an eight-step agentic workflow to build a CRUD-based Spring Boot application.
    • The Roo Code fork Kilo Code is used, orchestrating subtasks with their own context windows.
    • Significant issues were observed in the results, including (1) overeagerness; (2) gaps in the requirements filled with assumptions; (3) declaring success in spite of red tests; and (4) static code analysis issues.
    • Possible mitigations are suggested.
  • “Evaluation Guidelines for Empirical Studies in Software Engineering involving Large Language Models”, by Sebastian Baltes and 18 co-authors:
    • Large Language Models (LLMs) are positioned as study objects and as tools.
    • The eight guidelines are: “(1) explicitly declare when and how LLMs are used; (2) report model versions, configuration, and fine-tuning details; (3) describe the complete tool architecture beyond the model; (4) release prompts, their development, and, where possible, interaction logs; (5) validate LLM outputs with humans; (6) include an open LLM as a baseline; (7) select appropriate baselines, benchmarks, and metrics; and (8) report study limitations and mitigations, including costs, potential biases, and environmental impact.”

What are your thoughts on the guidelines? Have you experimented?

Skeptical Views and Warnings

Some voices from very different communities and viewpoints are:

The “AI 2027” website makes predictions about the future, with two endings, race and slowdown. Utopia or dystopia?

Wrap Up

It is rather hard to escape the topic these days; opinions and positions vary greatly. Every software engineer and IT architect needs one! These resources and the ones in the previous post helped me shape my (current) perspective. Follow the links to build or adjust yours!

– Olaf (ZIO)

Editorial information: No AI was used to write this post. 😉