Models cannot reliably self-generate effective skills
Score (%) comparing no skills, self-generated skills, and curated skills. Gemini CLI was not evaluated in the self-generated condition.
[Figure: y-axis Score (%), 0 to 50; series: No Skills, Self-Generated, Curated Skills.]