Feature Story
Ask HN: Can anybody clarify why OpenAI reasoning now shows non-English thoughts?
Jun 13, 2025 · news.ycombinator.com
The discussion points to a broader question about how AI models are trained and why such language mix-ups happen. It suggests that these "errors" might be linked to the diverse datasets used for training, which could include multilingual content. The thread invites readers to explore the underlying causes of these language insertions and whether they result from specific training strategies or from inherent complexities in how AI models process language.
Key takeaways
- Google's Bard/Gemini has been observed to insert random Hindi/Bengali words in its outputs.
- A specific instance was noted where Bengali words appeared in an o3-pro thought process.
- These occurrences raise questions about the training methods or other reasons behind the appearance of other languages.
- Similar language "errors" have been reported across several different models.