Image by the author created with midjourney.com


Agents, Planning, Evaluation and AI Index Reports 2025

2 min read · Apr 10, 2025


Started writing again! :)

I recently compiled a personal reading list of standout AI articles and reports — now available as a single PDF. It includes pieces from Chip Huyen, McKinsey, Anthropic, and a great paper on LLM evaluation:

📘 Download the full reading list (PDF)

Key Takeaways:

Agent Planning Failures: From Chip Huyen’s article, I learned how agents fail at planning: calling invalid tools, passing wrong parameters, or solving the wrong task. She also offers evaluation metrics and tool selection tips.
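To make the first two failure modes concrete, here is a minimal sketch of how a generated plan could be checked before it runs. It is not from the article: the tool registry, the Step class, and the validate_plan helper are all hypothetical names made up for illustration.

```python
from dataclasses import dataclass

# Hypothetical tool registry (an assumption, not from the article):
# tool name -> required parameter names and their expected types.
TOOLS = {
    "search_web": {"query": str},
    "get_weather": {"city": str, "days": int},
}


@dataclass
class Step:
    tool: str
    params: dict


def validate_plan(plan):
    """Return a list of human-readable problems found in the plan."""
    problems = []
    for i, step in enumerate(plan):
        schema = TOOLS.get(step.tool)
        if schema is None:
            # Failure mode 1: the agent picked a tool that does not exist.
            problems.append(f"step {i}: invalid tool '{step.tool}'")
            continue
        # Failure mode 2: parameters are missing, mistyped, or unknown.
        for name, expected in schema.items():
            if name not in step.params:
                problems.append(f"step {i}: missing parameter '{name}'")
            elif not isinstance(step.params[name], expected):
                problems.append(f"step {i}: wrong type for '{name}'")
        for name in step.params:
            if name not in schema:
                problems.append(f"step {i}: unexpected parameter '{name}'")
    return problems


# Example plan with one bad parameter and one unknown tool.
plan = [
    Step("get_weather", {"city": "Lisbon", "days": "three"}),
    Step("book_flight", {"to": "Lisbon"}),
]
problems = validate_plan(plan)
for p in problems:
    print(p)
```

A static check like this only catches invalid tools and bad parameters. The third failure mode, solving the wrong task, needs a judgment about the goal itself, so it would have to be evaluated some other way.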

Subjectivity in Evaluation: The EvalGen paper highlights how evaluation criteria evolve as reviewers read more outputs. Evaluation is dynamic, so rubrics should be refined over time rather than fixed up front.

AI Usage Trends: Anthropic’s economic index shows that AI is still used in a small fraction of tasks across professions. Learning and direct-use cases are growing, while iterative task use is declining — likely due to model quality improving.


Lucas Soares

Written by Lucas Soares

AI Engineer. I write about AI | Tools | Data Science | Productivity. Subscribe to my YouTube channel: https://www.youtube.com/@automatalearninglab/videos
