A study in Nature Human Behaviour compares the theory-of-mind capabilities of GPT-3.5, GPT-4, and LLaMA2-70B with those of humans, finding that the models show varying degrees of success across different tasks.
Health systems band together to test and publicly rank top AI models
Since the launch of ChatGPT in 2022,