Silicon Valley Vector x Yuandong Tian, Deep Dive on AI

Mar 5

Updated: Mar 9

“The flood is coming, yet many are still living in a 'quiet years' illusion.”

—— Yuandong Tian, Former Research Director at Meta AI

On February 28, 2026, Qinyun Cao, host of Silicon Valley Coordinates, interviewed Yuandong Tian in Silicon Valley. During his 11 years at Meta, Tian led research in reinforcement learning, LLM reasoning, and long context. He recently began a new venture as a co-founder. This conversation covers LLM competition, memory and storage, the frontier of reasoning, and the societal impact of agents. The following is a summary of the core content.

Yuandong Tian

PhD in Robotics from CMU. Former Research Director at Meta AI (FAIR) with 11 years of experience. He led the open-source Go AI ELF OpenGo (which defeated professional players on a single GPU), foundational work on LLM context extension Position Interpolation and Attention Sinks (ICLR 2024, 1,400+ citations), and the latent space reasoning exploration Coconut. He also co-led the Llama 4 Reasoning direction. His Google Scholar profile shows 22,000+ citations with an h-index of 74. He recently started a new venture as a Co-founder.

「Detailed Highlights」

1. LLM Competition: The Era of Distillation and Shrinking Leads

Tian notes that since large models took off in late 2022, competition has intensified, with iteration speeds approaching the "limit of human physiology." He attributes this largely to the widespread use of distillation: weaker models can rapidly approach the leading level by learning from the outputs of stronger models, so any technical lead is short-lived.

There is a clear divergence in strategy between tech giants and startups. Big tech companies (like Google) use their cash flow to maintain "First Tier" status, for example by having models like Gemini tackle unsolved mathematical problems. Startups face survival pressure: they must prove model strength to secure funding or, like OpenAI, find a viable business model (such as ad integration) before capital runs out. Tian believes this "catch-up" dynamic will persist.

2. Moat Ranking: Data > Infra > Algorithms > Talent

When asked about "moats" for the next 3–5 years, Tian provided a clear ranking: Data is most important, followed by Infrastructure, with Algorithms and Talent being relatively weaker.

Data: Vertical or rare scenario data is a "hard constraint." Without it, models cannot be trained effectively.
Infrastructure: Barriers are lowering due to AI-assisted programming. Tian said his own coding efficiency increased at least tenfold within three months of using AI.
Algorithms & Talent: Tian states "it is hard for a secret to stay a secret for long in Silicon Valley." New solutions usually spread across the industry within 2–3 months via talent mobility. Current algorithmic changes are mostly "minor patches" rather than disruptive leaps.
Compute: The gap between giants is small; it is more a competition of capital and usage efficiency.

3. Open vs. Closed Source: The Nuclear Deterrence Theory

Tian strongly supports open-source models. He argued in 2023 that the world cannot rely solely on closed-source models. If exponentially growing AI technology is held by only a few, it will lead to severe technological hegemony and social stratification.

He uses a "nuclear weapons" analogy: when most people have access to comparable model capabilities, it creates "technological equity" or a "balance of deterrence," avoiding the risk of abuse in a unipolar world. While Meta’s shift from open to closed source is a commercial decision, open source remains a necessary countervailing force for the technical ecosystem.

4. Two Types of Memory: Context vs. Weights

Tian categorizes memory into two types:

Context Memory: Similar to short-term working memory.
Weight Memory: Similar to long-term deep memory, solidified through pre-training.

Context research aims to extend windows at a low cost. Position Interpolation, from Tian's team, showed that dividing the position indices by 2 and then fine-tuning doubles the window at very low cost, breaking the myth that long-context models must be retrained from scratch.
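The scaling step can be illustrated with a minimal sketch. It assumes rotary position embeddings with base 10000, a hypothetical head dimension of 8, and a hypothetical trained window of 2048 tokens; the function and constant names are illustrative and are not taken from the original work's code.

```python
def rope_angles(position, head_dim=8, base=10000.0, scale=1.0):
    """Rotation angles for one position under rotary position embeddings.

    scale < 1 applies linear interpolation: the position index is shrunk
    before the angles are computed (scale = trained_len / target_len).
    """
    scaled_position = position * scale
    return [
        scaled_position / (base ** (2 * i / head_dim))
        for i in range(head_dim // 2)
    ]

TRAINED_LEN = 2048                # hypothetical original context window
TARGET_LEN = 2 * TRAINED_LEN      # doubled window
SCALE = TRAINED_LEN / TARGET_LEN  # 0.5, i.e. "divide positions by 2"

# Position 4000 lies outside the trained range, but after interpolation it
# maps to the same angles as position 2000, which the model saw in training.
interpolated = rope_angles(4000, scale=SCALE)
seen_in_training = rope_angles(2000)

# The largest index in the doubled window still lands inside the trained range.
max_scaled_position = (TARGET_LEN - 1) * SCALE
```

Because every scaled position stays inside the range the model was trained on, fine-tuning only has to adapt the model to the finer spacing between positions rather than to positions it has never seen.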

The real challenge lies in Weight Memory: a model’s knowledge is "frozen" the moment it is released. Post-training fine-tuning can only optimize locally; it cannot reshape a worldview. Therefore, Continuous Learning—updating weights during inference without "catastrophic forgetting"—is a more critical frontier than simply expanding context windows.

5. From Recitation to Epiphany: The Core Challenge

Tian observed his daughter learning to count: at age 3, she memorized mechanically; but one day at age 4, she suddenly understood the relationship between numbers and could independently perform two-digit addition/subtraction. This sudden reorganization of internal representation and logical "insight" (Satori/Eureka moment) is something AI has yet to achieve.

Tian argues that existing mechanisms (like Google's Nested Learning) are essentially "table lookups" and lack the human ability for "big picture" synthesis. Moving from rote memorization to structural insight is the key bottleneck on the way to AGI, and there is currently no breakthrough solution.

6. Future Memory Forms of AGI

Regarding AGI memory, Tian leans toward "fixed capacity but refined quality" rather than infinite expansion. Simply stacking storage (like the internet) does not produce wisdom. True intelligence lies in compression and abstraction, transforming massive data into a "world model" within the weights.

The next generation of models must not only store more but learn faster and "get it" more deeply. The ideal state is like a smart child who can learn from one example, rather than a child who requires constant repetition to master new tasks.

7. Why the Context Window Has No Ceiling

Demand for context windows has no ceiling because the use case has fundamentally changed: from "chatting" (100,000 words/day) to "autonomous agents" (processing entire codebases and multi-turn tool calls). To have an AI work continuously for a week, context requirements will easily exceed millions of tokens.

Current solutions like Claude's hierarchical memory or MIT’s Recursive Language Model are transitional. While the ultimate goal is intelligent compression, brute-force window expansion remains the most direct and effective path for now, as context length determines an agent's "sustained combat capability."

8. Storage Crisis: AI is Eating the World's Memory

As model parameters soar from 70B to 500B or even 1T (e.g., DeepSeek, Kimi K2), GPU memory (VRAM) has become the scarcest resource. When a model does not fit in a single card's memory, it must be split across cards (Model Parallelism), and the communication latency between cards severely reduces efficiency.

The H200 is more sought after than the H100 primarily because its larger memory reduces the number of cards needed and therefore the cross-card latency. Add multimodal inputs (such as 4K images) and long-running agent reasoning, and the pressure on memory is immense. Tian mentions that Google and Microsoft executives are frequently in Seoul to coordinate memory production, a sign of how physical this bottleneck is.
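A back-of-envelope calculation makes the card-count argument concrete. The sketch below assumes 2 bytes per parameter (bf16/fp16 weights), counts the weights only (no KV cache or activations), and uses the published per-card capacities of 80 GB for the H100 and 141 GB for the H200. The function names are illustrative.

```python
import math

BYTES_PER_PARAM = 2  # assumption: bf16/fp16 weights
GPU_MEMORY_GB = {"H100": 80, "H200": 141}  # published per-card HBM capacity

def weight_memory_gb(num_params):
    """Memory for the weights alone, in GB (10**9 bytes)."""
    return num_params * BYTES_PER_PARAM / 1e9

def min_cards(num_params, gpu):
    """Lower bound on the cards needed just to hold the weights."""
    return math.ceil(weight_memory_gb(num_params) / GPU_MEMORY_GB[gpu])

for label, params in [("70B", 70e9), ("500B", 500e9), ("1T", 1e12)]:
    cards = {gpu: min_cards(params, gpu) for gpu in GPU_MEMORY_GB}
    print(label, weight_memory_gb(params), "GB", cards)
```

Under these assumptions the weights of a 70B model need two H100s but fit on a single H200, and a 1T model needs at least 25 H100s versus 15 H200s. Real deployments need extra headroom for the KV cache and activations, so these are lower bounds rather than sizing guidance.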

9. Path Dependency of the Pre-training Scaling Law

Tian believes the Scaling Law is still valid, but with diminishing marginal returns. Tech giants stick to this path due to "path dependency"—their infrastructure is ready, and stacking compute and data is the "safest" strategy.

However, this path faces physical limits in power and storage. More importantly, simple scaling cannot solve the "knowledge freeze" problem. Breakthroughs depend on achieving true Continuous Learning.

10. The Scaling Law of Inference and Its Upper Bound

Regarding Test-Time Scaling (increasing inference time to improve results), Tian posits that the upper bound of Reinforcement Learning (RL) is "locked" by pre-training. RL is essentially a "search amplifier" that finds the correct solution among candidate paths provided by pre-training.

If the pre-training has never seen a specific type of problem, no candidate path exists. RL would be searching an "empty set." Therefore, inference capability must be built on a solid pre-training foundation.
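One simplified way to read the "search amplifier" argument is to treat inference as repeated independent sampling, where the pre-trained model produces a correct solution with probability p on each attempt. This independence model is an illustrative assumption, not Tian's own formulation. The chance that at least one of k attempts succeeds is then 1 - (1 - p)^k:

```python
def success_within_k(p, k):
    """Probability that at least one of k independent samples is correct."""
    return 1.0 - (1.0 - p) ** k

# Compare a problem type the base model never produces (p = 0) with ones
# it solves rarely (p = 0.01) or occasionally (p = 0.2).
for p in (0.0, 0.01, 0.2):
    print(p, [round(success_within_k(p, k), 3) for k in (1, 10, 100)])
```

More search helps quickly when p is small but positive, and not at all when p is exactly zero: no number of attempts finds a path the base model never proposes.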

11. The Future of Inference: Latent Space and Parallelism

Tian is optimistic about two inference directions:

Latent Space Inference (e.g., Coconut): The Chain of Thought is no longer an inefficient language sequence, but a high-dimensional vector. One vector can compress a segment of reasoning and encode multiple paths simultaneously (like quantum superposition), increasing efficiency exponentially.
Parallel Inference: Mimicking human multi-threading (e.g., processing five requests from a boss at once). Letting the model decompose and handle sub-tasks in parallel rather than serially.

12. The Root of Hallucination: Signal Space vs. Null Space

Tian explains hallucinations mathematically: model weight space is divided into "Signal Space" (meaningful structures) and "Null Space" (meaningless noise). In-distribution data suppresses the null space, but out-of-distribution data may accidentally activate noise, leading to hallucinations. The solution lies in opening the "black box" to understand weight mechanisms at a microscopic level.
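A toy linear model gives one possible reading of this picture. It is an illustrative interpretation, not Tian's actual analysis: the training inputs only ever vary along two of three axes, so the weight on the third axis is never constrained by data and can hold arbitrary noise. All names and numbers below are hypothetical.

```python
def dot(u, v):
    """Plain dot product of two equal-length lists."""
    return sum(a * b for a, b in zip(u, v))

# Hypothetical 3-d linear model. Training inputs only vary along the first
# two axes, so the third weight is never constrained by the data.
signal_weights = [1.0, -2.0, 0.0]  # structure learned from data
noise_weights = [0.0, 0.0, 5.0]    # arbitrary leftover in the unused direction
weights = [s + n for s, n in zip(signal_weights, noise_weights)]

in_distribution = [0.5, 1.0, 0.0]      # third coordinate is zero, as in training
out_of_distribution = [0.5, 1.0, 0.8]  # third coordinate is active

def noise_contribution(x):
    """Part of the output that comes from the unconstrained weights."""
    return dot(noise_weights, x)

in_dist_output = dot(weights, in_distribution)
ood_output = dot(weights, out_of_distribution)
```

For the in-distribution input the noise term is exactly zero, so the output is determined by the learned structure alone. For the out-of-distribution input the noise term dominates the output, even though the learned weights are unchanged.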

13. Security Risks of the "Crayfish" Agent

Tian uninstalled a "Crayfish" (Xiaolongxia) Agent after two hours due to security concerns. He uses an analogy: it is like sending a "clumsy child" who holds all your secrets (API keys, passwords) out to run errands; they could be tricked out of your home address by a few pieces of candy (Prompt Injection).

This is a systemic mismatch between current LLM reasoning capabilities and the permissions granted to them. He advises users to read the code and understand exactly what "keys" an agent holds before trusting it.
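One way to act on this advice is to decide explicitly which secrets an agent process may see. The sketch below is a minimal illustration that assumes the agent reads its credentials from environment variables; the allowlist, variable names, and helper functions are all hypothetical.

```python
# Hypothetical allowlist: the only variables this agent's task needs.
ALLOWED_ENV = {"PATH", "LANG", "WEATHER_API_KEY"}

def scoped_env(full_env, allowed=ALLOWED_ENV):
    """Return a copy of the environment containing only allowlisted keys."""
    return {k: v for k, v in full_env.items() if k in allowed}

def audit(full_env, allowed=ALLOWED_ENV):
    """List the variables that would be withheld from the agent."""
    return sorted(k for k in full_env if k not in allowed)

# Placeholder values stand in for real secrets.
example_env = {
    "PATH": "/usr/bin",
    "WEATHER_API_KEY": "placeholder",
    "AWS_SECRET_ACCESS_KEY": "placeholder",
    "GITHUB_TOKEN": "placeholder",
}
agent_env = scoped_env(example_env)
withheld = audit(example_env)
```

The filtered dictionary could then be passed as the env argument of subprocess.run when launching the agent, so that a successful prompt injection can leak only the keys the task actually required.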

14. The Impact of Agents on Society

Tian predicts that agents will replace the entire category of "transactional work" (customer service, secretaries, data entry). This replacement features a "coercion effect": when competitors use AI to take orders 24/7 and plan routes automatically, those who don't will be systematically eliminated due to low efficiency.

Commercial logic will also be reconstructed. Agents have no desires; they only seek the optimal solution (price-performance ratio, speed) and are not moved by ads or visual marketing. This renders traditional e-commerce "traffic funnel" models and flashy website designs obsolete.

15. Educating the Next Generation: Intent is Irreplaceable

In the face of AI, Tian believes the core of education should be cultivating "Intent." AI is a powerful execution tool, but the initial motivation—"what to do" and "why to do it"—is uniquely human.

The value of artistic creation lies in the creator's inner impulse and experience; this is the "soul" that machines cannot replace. If one masters "Intent," AI is the ultimate amplifier; without it, humans become mere appendages to their tools.

🔒 Yuandong Tian’s Next Stop

Tian revealed he has joined a startup as a Co-founder; the company is currently raising its Series A round. "We hope to announce it officially at a key moment."

🔒 To be decrypted...
