aimiox Insights
Welcome to a world where artificial intelligence and connectivity converge to shape the future. Here, we explore the transformative power of intelligent systems and uncover the innovations defining a new era.
29 Nov 2024. State of AI
This year has been monumental for transformer-based models, marking significant milestones and breakthroughs. We've witnessed the emergence of multiple state-of-the-art (SOTA) models that push the boundaries of what's possible while introducing reasoning capabilities and multimodality: the ability to process and integrate diverse types of data such as text, images, and audio.
2025 will be the year of post-training innovations and agentic flows as pre-training scaling comes to an end in 2024.
One standout achievement was OpenAI's release of its first reasoning model, o1, which demonstrated remarkable proficiency in domains like mathematics, coding, and science, showcasing the next evolution of intelligent systems.
Meanwhile, smaller models have risen to prominence, matching or even exceeding the performance of previous-generation models that were 10x larger. This leap in efficiency is a game-changer, as it significantly reduces the cost per million tokens, making AI more accessible and scalable for real-world applications.
We are particularly excited about the rapidly advancing open-source ecosystem. Open-source models are increasingly able to compete with closed state-of-the-art (SOTA) models, while frameworks are enabling innovations like fine-tuning, quantization for efficient inference, retrieval-augmented generation (RAG), and agent flows. These advancements are democratizing access to powerful AI tools and making cutting-edge capabilities more widely available.
We’ll dive deeper into these topics later, but it's worth noting that the rumors of pre-training scaling hitting a wall could have serious implications for the field. Until now, the prevailing industry assumption has been that more parameters, more data, and more compute would inevitably lead to better models. If this paradigm is shifting, the open-source ecosystem will likely emerge as one of the winners, thanks to its focus on innovation and a “do more with less” attitude, as the table below shows:
Model | Artificial Analysis Quality Index | Scientific Reasoning & Knowledge (GPQA) | Coding (HumanEval) |
---|---|---|---|
o1-preview | 85 | 67% | 95% |
o1-mini | 82 | 58% | 93% |
Claude 3.5 Sonnet (Oct'24) | 80 | 58% | 93% |
Gemini 1.5 Pro (Sep'24) | 80 | 61% | 87% |
GPT-4o (Aug'24) | 77 | 51% | 90% |
Qwen-2.5 72B | 75 | 50% | 85% |
Llama-3.1 405B | 72 | 50% | 82% |
GPT-4o-mini | 71 | 43% | 86% |
GPT-4o (Nov'24) | 71 | 39% | 90% |
Qwen-2.5 Coder 32B | 70 | 37% | 85% |
Claude 3.5 Haiku | 69 | 41% | 85% |
Llama 3.2 (90B Vision) | 67 | 42% | 75% |
Deepseek-2.5 (MoE 238B) | 66 | 42% | 86% |
Llama 3.2 (11B Vision) | 53 | 25% | 68% |
GPT-3.5 Turbo (375B) | 52 | 30% | 69% |
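To make the "do more with less" reading of the table concrete, here is a minimal Python sketch that compares the best open-weights entry with the best closed entry on the Quality Index. The scores are copied from the table above; the `open_weights` flag is our own labeling and is not part of the original data, and the helper name `best` is purely illustrative.

```python
# Rows copied from the table above:
# (model, quality index, GPQA %, HumanEval %, open_weights)
# The open_weights flag is an added label, not a column in the original table.
ROWS = [
    ("o1-preview", 85, 67, 95, False),
    ("o1-mini", 82, 58, 93, False),
    ("Claude 3.5 Sonnet (Oct'24)", 80, 58, 93, False),
    ("Gemini 1.5 Pro (Sep'24)", 80, 61, 87, False),
    ("GPT-4o (Aug'24)", 77, 51, 90, False),
    ("Qwen-2.5 72B", 75, 50, 85, True),
    ("Llama-3.1 405B", 72, 50, 82, True),
    ("GPT-4o-mini", 71, 43, 86, False),
    ("GPT-4o (Nov'24)", 71, 39, 90, False),
    ("Qwen-2.5 Coder 32B", 70, 37, 85, True),
    ("Claude 3.5 Haiku", 69, 41, 85, False),
    ("Llama 3.2 (90B Vision)", 67, 42, 75, True),
    ("Deepseek-2.5 (MoE 238B)", 66, 42, 86, True),
    ("Llama 3.2 (11B Vision)", 53, 25, 68, True),
    ("GPT-3.5 Turbo (375B)", 52, 30, 69, False),
]


def best(rows, open_weights):
    """Return the row with the highest quality index in the chosen group."""
    candidates = [row for row in rows if row[4] == open_weights]
    return max(candidates, key=lambda row: row[1])


best_open = best(ROWS, open_weights=True)
best_closed = best(ROWS, open_weights=False)
gap = best_closed[1] - best_open[1]

print(f"Best open-weights model: {best_open[0]} ({best_open[1]})")
print(f"Best closed model:       {best_closed[0]} ({best_closed[1]})")
print(f"Quality index gap:       {gap} points")
```

On these numbers, the top open-weights model trails the top closed model by 10 index points while sitting ahead of several closed models further down the table.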
Looking ahead, here is what we expect for 2025:
- Intelligence will be too cheap!
- I need more agents!
- Do I believe what I see?
2025 will bring an exciting shift in the AI landscape. Smaller, highly capable models are evolving rapidly, and with the arrival of new GPUs and inference chips, costs are dropping faster than ever. The price of 1M tokens is drifting down to just $0.10 for mainstream models, and renting a powerful H100 GPU can now cost less than $1/hour. Let's not forget that Apple’s M4 chip is already making waves, the RTX 50 series is on the horizon, and Intel Gaudi 3 cards promise to bring affordable private AI servers within everyone’s reach.
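As a rough back-of-the-envelope check on how these two prices relate, the sketch below computes the throughput a rented GPU would need to sustain for self-hosting to match the per-token price. It uses only the two figures quoted above ($1/hour and $0.10 per 1M tokens). The function name is hypothetical, and the calculation deliberately ignores utilization, batching, engineering effort, and other real-world costs.

```python
def breakeven_tokens_per_second(gpu_cost_per_hour: float,
                                price_per_million_tokens: float) -> float:
    """Throughput at which hourly GPU rental equals the per-token API price.

    Simplified assumption: the GPU is fully utilized and rental is the only cost.
    """
    # Number of tokens the hourly rental fee would buy at the quoted price.
    tokens_per_hour = gpu_cost_per_hour / price_per_million_tokens * 1_000_000
    return tokens_per_hour / 3600  # seconds per hour


rate = breakeven_tokens_per_second(gpu_cost_per_hour=1.0,
                                   price_per_million_tokens=0.10)
print(f"Break-even throughput: {rate:.0f} tokens/s")
```

Under these assumptions, the break-even point is about 10 million tokens per hour, or roughly 2,800 tokens per second of sustained throughput.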
This year, we’ve seen agentic flows starting to mature, especially in areas like customer support, coding, and web search. With the availability of more multimodal models, advanced reasoning capabilities, and long-context window models, alongside powerful open-source agent frameworks, we’re excited to see how agent flows will evolve. We expect them to grow significantly and eventually replace more traditional RAG applications.
While we wait for OpenAI’s Sora release, multiple companies have introduced realistic AI image and video generation models, with some even releasing open-source versions. The lines between AI and reality continue to blur, and we expect this space to see significant advancements in both quality and accessibility. It’s becoming clear that these technologies will redefine industries like entertainment, design, and marketing in ways we’re only beginning to imagine.
2025 is shaping up to be the year of locally deployed open-source models and the rise of agentic flows. The question is: are you ready?