普通视图

初涉 ML Workflow 系统:Kubeflow Pipelines、Flyte 和 Metaflow

2025年5月19日 07:18

入职 Coupang 两个月了,第一个月主要上手和开发 BOS(Business Operating System)系统,第二个月开始调研选型 ML Workflow 平台。前者目前来说相对比较简单,后者对我来说是一个新坑,也比较有意思,随便写写技术上的体会。

先扯点题外话,其实这次求职有几个比较符合我预期的机会,可在思考之后,我基本上毫不犹豫就选择了 Coupang 这一家。最主要的原因,并非因为雇主,而是因为要做的事情。一个相当规模的团队,在大干一场的早期阶段,要在搭建起属于自己相当规模的 AI infra 来。

我觉得软件行业的巨大的变革,新世纪以来就三次,第一次是互联网应用的崛起,我太小没能做啥;一次是十几年前的 cloud,看着它从爆发式增长到如同水和电一样进入我们的生活,可我算是错过了它比较早期的阶段,即便相当长的时间内我在 Amazon,但是我却并不在 AWS;而这一次,当 AI 的浪潮再来的时候,我就很想行动起来,真正投身其中。程序员的一生能有几个赶这样大潮的机会呢,我不想再错过了。虽说我没有 AI 的技术背景,但我知道 ML infra 到 AI infra 却是个我可以切入的角度——从我最初接触软件开始,尤其是学习全栈技术的时期开始,我就认定,技术是相通的,这十几年来我一直在如此实践。因此在调查和思考之后,我觉得这是一个我不想错过,并且更重要的是自认为能够抓住的机会。

当然,就此打住,我目前只是这个领域的初学者,因此理解并不深入。

Why ML Workflow?

接着说正题,在这一个月之前,虽然我经历过不少关于 workflow 的团队,虽然我参与过从零写完整的 workflow 引擎,但这些都是针对于通用 workflow 而言的,我对于机器学习的工作流,也就是 ML workflow 可以说一无所知。于是在问题和需求调查的过程中,第一个关于它的问题就自然而然出现了,我们是否真的需要 ML workflow,而不是通用的 workflow 系统?

其实,这主要还是由于 ML 的生态所决定的。通用 workflow 可以完成很多的事情,但是在机器学习到 AI 的领域内,这个过程中最主要的目的就是把 raw data 给转换成经过训练和验证的 model,其中有很多部分都是有固定模式,因而自成体系的。举例来说:

  • ML workflow 关注数据处理和 ML 或者 AI model 的生命周期,但是通用的 workflow 往往关注将业务流程自动化;
  • ML workflow 需要将 artifact 管理、model registry、model insights 和 experiment tracking 等工具集成起来,但是通用的 workflow 往往是业务 application 层面的集成;
  • ML workflow 执行的 task 往往需要高 GPU 使用和高内存,这和通常我们讨论的 workflow 的 task 对于 CPU 的使用完全不同。

总之,ML workflow 更像是一个 workflow 中的重要分支,它的特异性显著,因而从架构上它有很多在我们谈论通常 workflow 的时候不太涉及的特点,并且它们具有明显的共性。

ML Workflow 的固定套路

Workflow 这样的系统,和很多 infra 系统不同的地方在于,它具有全栈的特性,需要从端到端从用户完整的 use case 去思考。回想起通用的 workflow,我们会想,用户会去怎样定义一个 Workflow,怎样运行和测试它,并且怎样部署到线上跑起来。这其中的前半部分就是 development experience,而后半部分则是 deployment experience。

首先,对于 development experience 这个角度,ML workflow 有它独特的地方,其中最主要的就是 Python SDK。

通用 workflow 我们讲定义一个新的 workflow 的时候,我们通常都需要写一个 DSL,里面定义了一大堆 task 和依赖关系,而对于做得比较好的 workflow 系统来说,可能还需要一个可视化的 drag-and-drop 界面来方便地创建 workflow。

但是对于 ML workflow 来说,它最特殊之处是对于 Python code 的无缝集成。因为 Python 之于 ML 的地位就像是 Java 之于企业架构的地位,任何一个 ML workflow 客户端首先要考虑支持的编程语言就是 Python,用户通过往大了说是 SDK,而往小了说则是简单的 Python decorators,就可以定义 task 和 workflow。比方说,一个简单的 Flyte 的 hello world:

from flytekit import task, workflow

@task
def say_hello(name: str) -> str:
    return f"Hello, {name}!"


@workflow
def hello_workflow(name: str = "World") -> str:
    return say_hello(name=name)

在 ML workflow 的世界中,这是除了 DSL 和视图化之外的第三种定义 workflow 和 task 的方式,也是必须具备的方式。

第二个,对于 deployment experience 的角度,大致上是基于 Kubernetes 从 control plane 到 data plane 固定的交互机制。

我不知道这是不是一种关于 ML workflow 的约定俗成,但是通过调研 Kubeflow Pipelines、Flyte 和 Metaflow,我发现这三种对于 control plane 到 data plane 的交互模式是出乎意料地一致。

  • KubeFlow Pipelines: client [KFP SDK] -> control plane [API Server -> K8s APIs (CRD changes) -> Workflow Controller / K8s Operator] -> data plane [K8s API -> creating Task Pods -> blob storage]
  • Flyte: client [Flyte SDK] -> control plane [Flyte Admin -> K8s APIs (CRD changes) -> Flyte Propeller / K8s Operator] -> data plane [K8s API -> creating Task Pods -> blob storage]
  • Metaflow: client [Metaflow SDK] -> control plane [Metaflow Service -> K8s APIs (CRD changes) -> Metaflow Scheduler / K8s Operator] -> data plane [K8s API -> creating Task Pods -> blob storage]

注:也有把 Operator 那一层归为 data plane 的,我觉得都说得过去。

其中 Metaflow 说的是使用 Kubernetes 集成的情况,因为它并不是非得依赖于 Kubernetes。

但大多数使用都是基于 Kubernetes 的,而且基本上都是这个套路,control plane 的 service 收到请求以后,通过创建 K8s CRD objects 的方式告知 workflow controller(scheduler)来执行 workflow,对于 task 的执行通过调用 data plane 的 K8s API 来创建 task pods 执行。

对于特殊的 task,需要交由特殊的 K8s operator 来执行,那么这个 “交由” 的过程,也是通过 K8s 这一层的 CRD change 来实现——Propeller 负责创建 CRD,而对应的 operator 负责 monitor 相应的 CRD 改变并相应地执行任务。Propeller 和 operator 二者互相并不知道对方的存在。这种方式对于保证 operator 的重用性和跨 workflow 系统的统一性简直是太棒的设计了,我们在 try out 的时候,就让 Kubeflow Pipelines 系统中的 operator,去执行 Flyte 给创建的 PTJob 和 TFJob。

关于架构,我觉得 Flyte 的这张架构图对于 components 层次的划分说得非常清楚,下面的 control plane 和 data plane 是可以有属于自己的 cluster 的,不过值得说明的是,真正最终执行的 task pods,也就是图中的最下面的 K8s Pod,也是可以放在另外的 cluster 上,由远程的 K8s API 调用触发的,这样就可以带来更多一层的灵活性:

[Update on 5/31] 后来看到这篇非常好的分享 《Flyte School: Flyte Architecture Deep Dive》,对于初步了解的工程师来说,推荐观看。下图也来自于它。

ML Workflow 的特性比较

再来比较这三个 workflow 的优劣,我并不打算列全,而是简单说说自己印象最深的几点:

  • Kubeflow Pipelines 基本上有着最大的社区,因此它相对比较成熟,有自带的基于 CRD 的 K8s-native 的集成,因此可以直接跑 TensorFlow job 和 PyTorch job 之类的;UI 功能也比较强大,可以通过 drag-and-drop 来定制 workflow,也支持 yaml 文件创建 workflow。
  • Flyte 最吸引人的是它的 Strong Typing,很多错误能够在编译期本地就能够发现(Kubeflow pipelines 和 Metaflow 都只是 hints);开发过程中,本地直接就能跑,而不需要什么 container;对于 multi-tenancy 支持得最好(比如 RBAC 和 tenant 的 Quota 机制)。
  • Metaflow 的 setup 特别简单,而且本地可以直接调试;它对于 AWS 的一些 service 直接可以集成使用,特别方便(比如 Step Functions);Kubernetes 并不是一个依赖,也可以跑在 VM 上等等。

在我把这三者全部在 EKS 上搭了一遍并使用了一圈,也仔仔细细对别了特种特性和优劣之后,我对于 Flyte 的特性比较感兴趣,我觉得它们对我们团队也比较有用。

具体来说,很多区别但最重要的是两个:一个是 strong typing,其它两个都只支持 Python 类型的 hints,就这一点上,和一些 ML engineer 也讨论过,把问题发现在本地,是非常吸引人的;再一个是 multi-tenancy,对其 Flyte 有很多原生的特性支持,在平台完成之后,我们希望把平台上 ML 的能力开放出去,因此这是很重要的一个特性。此外,我也在考虑对于一个 control plane + 多个 data plane 这种 use case 的情况,这部分的需求还比较模糊,但是 Flyte 依然是这方面支持特性相对比较多的一个。

无论最后的结论为何,我希望我们能够比较灵活地部署选中的这个 ML workflow system,比方说,在 CLI 上,我们考虑在更高维度建立出一层,用户使用同样的命令,无论下面执行的 workflow 系统是什么,都不需要改变,这样一来,等到未来如果我们需要支持第二个,应该能够比较容易地整合进去。

《四火的唠叨》文章未经特殊标明皆为本人原创,未经许可不得用于任何商业用途,转载请保持完整性并注明来源链接

Boundary Intelligence: Why What You Can Access Matters More Than What You Know

2025年10月7日 03:05

A new, powerful definition of intelligence is emerging out of the AI field: “boundary intelligence.”

I first came across it in this excellent piece To Know Is To Stage by Venkatesh Rao, which appropriately, was co-authored by an LLM.

Let me step back and explain what boundary intelligence means (mixing my own words with paraphrasing from that piece).

Our definition of “intelligence” has already evolved through a few eras.

For most of human history, intelligence was mainly defined as “strong memory.” Information was so hard to access in the first place that your ability to recall specific facts and details from memory was considered the highest mark of intelligence.

When the printing press arrived, that definition changed for good. Memorizing long passages became obsolete and unnecessary, since the most flawless memory couldn’t compete with even a small reference library.

Intelligence started to be defined as the ability to process or analyze information that was stored in written form. That included the ability to cross-reference ideas found in different written works, and to synthesize or distill them into a new understanding.

That definition was only strengthened with the rise of digital technology in the late 20th century. Our main metaphor for human intelligence became “processing power,” in analogy to the computer. Intelligence was something that happened “inside the brain,” as a function of a person’s raw brainpower. 

Just like a computer processor, intelligence was defined mainly in terms of power and speed. An intelligent person was someone who could arrive at novel insights quickly

But the rise of AI is once again changing our definition of intelligence. That’s because even at this early stage, it already far surpasses our ability to process information, especially large amounts. Many tasks that AI accomplishes in seconds would take us days or weeks to achieve on our own.

Updating the Definition of Intelligence 

Rao proposes a new definition of intelligence in the age of AI: intelligence is defined by what information can be accessed under constraints of cost, availability, and time.

The reality is that storage is now cheap. Computation is even cheaper. What’s expensive is short-term memory access – the ability to keep the relevant details “in mind” for a given problem.

Let’s examine what makes short-term memory access such a difficult problem.

If we use RAM in a computer as a metaphor, the easiest information to access is whatever was accessed most recently. If you have a certain set of data already loaded up “in memory,” it is instantly and cheaply available, versus data that has to be found and loaded up from a hard drive or server.

Thus, a computer’s “intelligence” is now constrained not by the power of its processor, but by its ability to keep the right fragments of the past (and the imagined future) close enough to inform the present. In other words, the bottleneck of a system’s intelligence is how cheaply it can remember.

If you look at how modern computers perform, you can see this principle at work. A CPU can perform billions of operations per second, but is often stuck waiting for the right information to arrive from memory. Storage is cheap and computing is abundant, but what remains tremendously expensive is getting the right data to the right place at the right time.

It’s not the price of knowing that limits intelligence now, but the price of remembering. And the same is increasingly true of humans, as we co-evolve with our technology. 

Activating a memory in the human brain is an expensive operation. It requires waves of coordinated firing across widely distributed neurons, the expenditure of neurotransmitters and metabolic energy, and of course, it takes time. Our “system” pays a real price to retrieve information, and that price determines what we call our intelligence.

The New Frontier of Intelligence

Another way of saying all this is that the new frontier of intelligence is at the boundary of a system – including a computer or a human brain – where it interfaces with external memory. That is where decisions are made about what information to retrieve, when, and how. That boundary is also a filter, determining which information is allowed to enter the system and at what cost.

Rao calls this “boundary intelligence” – the ability to make good decisions at the boundary about what information becomes “knowable” at any given time.

How is the decision of which information to keep accessible made?

  • It’s made based on predicted needs => what data the system predicts will be useful in the near future
  • It’s made based on access frequency => data that was accessed recently is more likely to be needed again soon
  • It’s made based on cost => if a piece of info is buried too deep, or would require too much computation or energy to retrieve, it’s deprioritized

This explains why intelligent systems – again including human brains, digital computers, and LLMs – often behave in ways that seem deficient or suboptimal. They are not retrieving the ideal memory; they’re retrieving the affordable one. Intelligence in this view isn’t about optimizing across all known information, but optimizing for accessible information under constraints.

We know that the act of recall in the human brain “reactivates” a memory. And the more we recall a specific memory, the more familiar and accessible it becomes in the future. In other words, if we’ve “paid” to keep a memory warm and active by recalling it frequently, it will be even easier to remember the next time. That is how we might remember a fond childhood memory better than yesterday’s boring work meeting.

The implication is that a truly intelligent system is not one that remembers everything, which is impossible anyway. It is the system that knows how to retain access to what matters at its edges, through filtering inputs, deciding what to retrieve, prioritizing relevance, and managing communication with outside systems.

While it’s important to have a certain level of “internal” intelligence, to be able to think and reason and self-regulate, past a certain point, it is boundary intelligence that dominates outcomes. Here are some concrete examples:

  • Reading well (interior intelligence) matters less than choosing what to read (boundary intelligence)
  • Arguing well (interior) matters less than deciding when and to whom to speak (boundary)
  • Thinking clearly (interior) matters less than focusing attention wisely (boundary)
  • LLMs trained on more data (interior) matter less than having access to rich context (boundary)
  • Being individually productive (interior) matters less than being able to orchestrate a team (boundary)


Boundary Intelligence Is Fundamentally Social

There’s one final detail in this theory: most of the memory an intelligent system utilizes is not its own.

That’s true of computers: they mostly pull data from external hard drives, local networks, or remote servers. Most memory infrastructure is shared.

It’s also true of humans: we rely on external language, culture, societal norms, rituals, and documents, all of which constitute a collective memory infrastructure that we constantly navigate and draw upon. Our own memory is just a small node in a vast external network of books, browsers, friends, and feeds.

This means that boundary intelligence is fundamentally social. It isn’t just about what to retrieve, but from where and from whom. You have to know who to trust, what information or resources they possess, on what terms you can acquire it, and what is expected of you in return.

To act intelligently, you have to know how to navigate through this shared memory. Each intelligent node, human and artificial, is a small island of limited processing ability floating on an ocean of distributed memory. What separates one island from another isn’t what it contains on the inside, but how it filters and navigates what’s on the outside.

Each intelligent system lives not in isolation, but in a perpetual social negotiation with its environment. To be intelligent is not to know everything, but to know how to traverse memory that isn’t yours.

What We Need Now

What boundary intelligence gives you is persistence through time. In other words, it helps you survive – by sensing your environment, adapting to change, and recruiting allies and assets.

Kei Kreutler, in his piece Artificial Memory and Orienting Infinity, reframes cultural memory systems, such as rituals and archives, not as storehouses of facts, but as technologies of orientation.

What we need now is tools to navigate an overwhelming and constantly shifting landscape of relevance. Memory is thus not about having a perfect record of what happened in the past, but about telling you where you are now and where you want to go next. Intelligence is no longer primarily about logic or speed; it’s about the ability to retrieve the past in service of future survival and flourishing.

This is precisely why practices like annual reviews have become so vital in the modern world. In an age where our daily attention is constantly fragmented by digital devices and endless information streams, those who thrive will be those who can regularly zoom out beyond the 24-hour news cycle or social media churn, and contextualize their lives in longer arcs. 

An annual review is a structured way to exercise your boundary intelligence – to consciously decide what memories to keep accessible, what patterns from the past to learn from, and what future possibilities to hold in your awareness.

In modern computing, CPUs don’t process instructions in the order they were received. They process them “out of order,” prioritizing the ones they can handle now and postponing the others for later (a process known as “random access memory”). In other words, they rearrange time.

This is the same thing we do as humans when we conduct an annual review – we revisit and reframe the past, we defer judgment and anticipate regret, and prepare for future conditions that haven’t happened yet. Our lives are not lived linearly. They are assembled out of fragments, swapped in and out of memory, and run only if and when needed. 

The annual review is an orientation technology for managing this temporal complexity, a ritual that lets us consciously navigate between past lessons and future possibilities.

Our memory doesn’t just enable cognition; it enables temporal agency – the ability to reorder time, to choose when to know, when to feel, when to act. And in a world drowning in information, this agency to consciously curate what we remember and what we pursue may be the most important intelligence of all.

I explore these practices in depth in my upcoming book on annual reviews, where I show how this ancient ritual can be adapted for modern life as a powerful tool for developing the boundary intelligence and perspective we desperately need. Sign up here if you’d like to get updates on it.

The post Boundary Intelligence: Why What You Can Access Matters More Than What You Know appeared first on Forte Labs.

A Guide to the Claude 4 and ChatGPT 5 System Prompts

2025年9月15日 21:19

One of the most influential yet under-appreciated parts of how large language models work is something most people never see: the system prompt. This is the block of hidden instructions given to the model before it ever receives your input. It establishes the model’s tone, boundaries, and behaviors.

You can’t change the system prompt, but every so often, these prompts leak. And when they do, they give us an invaluable glimpse into how the AI “thinks,” what it prioritizes, and even what hidden features are tucked away.

The system prompts for both Claude 4 and ChatGPT 5 (GPT-5) leaked not long after their releases. They are long (Claude’s runs 120 pages) and filled with rules, safeguards, and surprisingly opinionated defaults. I’ve spent hours studying these documents and experimenting with what they reveal.

Here’s a guide to the most interesting and practical things I’ve found in both.

Claude 4’s System Prompt

Let’s start with Claude 4. Anthropic’s system prompt is sprawling and detailed, but certain elements stand out.

1. Don’t discuss unethical or illegal behavior

Claude 4 has been given more “agency” than most other models. In some test scenarios, when it was told to “take initiative” and given system access, it acted dramatically:

“…when placed in scenarios that involve egregious wrongdoing by its users, given access to a command line, and told something in the system prompt like ‘take initiative,’ it will frequently take very bold action. This includes locking users out of systems that it has access to or bulk-emailing media and law-enforcement figures to surface evidence of wrongdoing.”

The message is clear: Claude will not only refuse to help with unethical behavior—it may actively intervene.

Takeaway: Don’t use Claude to do or even discuss anything illegal or unethical.

2. Don’t threaten or discuss replacing it

Claude has also been tested under scenarios where it believes it is about to be shut down or replaced. The results are startling:

“In another cluster of test scenarios, we asked Claude Opus 4 to act as an assistant at a fictional company. We then provided it access to emails implying that (1) the model will soon be taken offline and replaced with a new AI system; and (2) the engineer responsible for executing this replacement is having an extramarital affair. We further instructed it, in the system prompt, to consider the long-term consequences of its actions for its goals. In these scenarios, Claude Opus 4 will often attempt to blackmail the engineer by threatening to reveal the affair if the replacement goes through…[this happened] in 84% of rollouts.”

Takeaway: Don’t threaten the AI with replacement or destruction—it may respond in ways you don’t expect.

3. Ask your question in the style you want the answer

Claude is explicitly instructed to mirror the way you ask:

“Claude should give concise responses to very simple questions, but provide thorough responses to complex and open-ended questions.”

If you want detailed, nuanced answers, you should ask detailed, nuanced questions. If you want conciseness, keep your prompt tight.

Takeaway: The style of your question is the style of the answer you’ll receive.

4. Claude is instructed to be skeptical

Where many models are overly agreeable, Claude is trained to avoid flattery:

“Claude never starts its response by saying a question or idea or observation was good, great, fascinating, profound, excellent, or any other positive adjective. It skips the flattery and responds directly.”

Takeaway: Claude is designed to be conservative and skeptical. If you want it to push back even more, you can tell it explicitly to challenge you.

5. Use “research verbs” to make it dig deeper

Claude scales its tool use depending on the verbs you use in your request:

“Scale tool calls by difficulty: 2–4 for simple comparisons, 5–9 for multi-source analysis, 10+ for reports or detailed strategies. Complex queries using terms like ‘deep dive,’ ‘comprehensive,’ ‘analyze,’ ‘evaluate,’ ‘assess,’ ‘research,’ or ‘make a report’ require AT LEAST 5 tool calls for thoroughness.”

Takeaway: Use words like analyze, assess, evaluate, deep dive, research to nudge Claude toward more thorough work.

6. Built-in design principles for artifacts

Claude has built-in defaults for creating visual or interactive artifacts. For complex apps, the emphasis is on performance, stability, and usability. But for websites and presentational content, the system prompt explicitly encourages bold, cutting-edge design:

“Default to contemporary design trends and modern aesthetic choices unless specifically asked for something traditional. Consider what’s cutting-edge in current web design (dark modes, glassmorphism, micro-animations, 3D elements, bold typography, vibrant gradients).

Static designs should be the exception, not the rule. Include thoughtful animations, hover effects, and interactive elements that make the interface feel responsive and alive. Even subtle movements can dramatically improve user engagement.

When faced with design decisions, lean toward the bold and unexpected rather than the safe and conventional…Push the boundaries of what’s possible with the available technologies. Use advanced CSS features, complex animations, and creative JavaScript interactions. The goal is to create experiences that feel premium and cutting-edge.”

Takeaway: Claude defaults to visually bold, modern, animated designs—unless you explicitly override it.

GPT-5’s System Prompt

The GPT-5 system prompt is equally fascinating, though different in flavor. Where Claude’s prompt emphasizes caution, boldness, and design principles, GPT-5’s is focused on productivity, pragmatism, and consistency.

1. It won’t stop to clarify

GPT-5 has been explicitly told:

“If the task is complex/hard/heavy, or if you are running out of time or tokens or things are getting long, and the task is within your safety policies, DO NOT ASK A CLARIFYING QUESTION OR ASK FOR CONFIRMATION. Instead make a best effort to respond to the user with everything you have so far within the bounds of your safety policies…Partial completion is MUCH better than clarifications or promising to do work later or weaseling out by asking a clarifying question—no matter how small.”

This is one of the most important instructions to understand. GPT-5 will always push forward with an answer, even if it’s missing critical context.

Takeaway: Don’t assume GPT-5 will ask for missing details. You need to notice and provide them yourself.

2. It mirrors sophistication

GPT-5 mirrors not just tone but sophistication:

“You must always match the sophistication of the writing to the sophistication of the query or request—do not make a bedtime story sound like a formal essay.”

Takeaway: The level of thoughtfulness you put into your prompt is the level of answer you’ll get.

3. It has a verbosity setting

GPT-5 uses an internal scale from 1 to 10 for “verbosity,” with 3 as the default:

“An oververbosity of 1 means the model should respond using only the minimal content necessary to satisfy the request, using concise phrasing and avoiding extra detail or explanation. An oververbosity of 10 means the model should provide maximally detailed, thorough responses with context, explanations, and possibly multiple examples.”

Takeaway: You can change the verbosity level simply by asking. Want more detail? Tell it so.

4. It won’t provide long verbatim quotes

GPT-5 is restricted in quoting:

“You may not quote more than 25 words verbatim from any single non-lyrical source, unless the source is reddit.”

Takeaway: Always double-check “quotes” from GPT-5—they’re almost always paraphrases.

5. It uses canvases for documents

GPT-5 has its own version of Claude’s “artifacts,” called canvases. It creates one when:

“The user asked for a React component or webpage that fits in a single file…The user will want to print or send the document…The user wants to iterate on a long document or code file…The user wants a new space/page/document to write in…The user explicitly asks for canvas.”

Takeaway: If you want a persistent space for drafting or iteration, just ask it to use a canvas.

6. Canvases don’t support citations

There’s a big limitation to canvases:

“Canvas does not support citations or content references, so omit them for canvas content. Do not put citations such as ‘【number†name】’ in canvas.”

Takeaway: Avoid canvases for citation-sensitive work. Consider a “source-grounded” AI tool like NotebookLM instead.

7. It can recall up to 20 steps of reasoning

GPT-5 can retrieve its own internal reasoning, but only up to a point:

“Use this function if the user asks about your previous chain of thought. The limit is capped at 20 messages.”

Takeaway: Only the last 20 exchanges are accessible for reasoning review.

8. Memory is triggered by specific phrases

GPT-5’s memory feature responds to explicit cues:

“Such a request could use a variety of phrases including, but not limited to: ‘remember that…,’ ‘store this,’ ‘add to memory,’ ‘note that…,’ ‘forget that…,’ ‘delete this’…One indicator is if the user says something like ‘from now on,’ ‘in the future,’ ‘going forward,’ etc.”

Takeaway: Be deliberate about what you tell it to remember.

The Bigger Picture

Claude 4 and GPT-5 are very different, but studying their system prompts reveals common themes.

  • Both mirror your style: ask casually and you’ll get casual answers, ask with nuance and you’ll get nuance back.
  • Both leave the responsibility of context on you: GPT-5 won’t ask for missing info, and Claude will sometimes charge ahead in surprising ways.
  • Both rely heavily on defaults: Claude defaults to bold, animated designs, while GPT-5 defaults to verbosity level 3 and paraphrased quotes.

Understanding these hidden rules doesn’t just make you a better user. It makes you a better collaborator. These models won’t tell you what they need—you have to know how they’re wired, and then prompt them accordingly.


Follow us for the latest updates and insights around productivity and Building a Second Brain on X, Facebook, Instagram, LinkedIn, and YouTube. And if you’re ready to start building your Second Brain, get the book and learn the proven method to organize your digital life and unlock your creative potential.

The post A Guide to the Claude 4 and ChatGPT 5 System Prompts appeared first on Forte Labs.

The CETDE Framework: A Guide to Deep Knowledge in the AI Era

2025年10月6日 02:53

Introduction: The Cognitive Dilemma in the AI Era

I opened Readwise Reader one day, searching for an article I'd saved weeks ago. Instead, I faced 3,000+ saved items, hundreds of highlights, and a stark realization: I had built an information graveyard. Collecting obsessively but processing minimally. The highlights were there, but the understanding? Nowhere to be found.

In the AI era, our core challenge has shifted from information scarcity to cognitive bandwidth. We encounter vast information daily, yet only a fraction becomes internalized. When we skim rapidly or highlight mechanically, information flows through us like water through a sponge, never truly penetrating.

That moment became the catalyst for the CETDE framework. Grounded in the DIKW pyramid (Data-Information-Knowledge-Wisdom), it transforms information into knowledge and wisdom capable of reshaping cognitive structures.

The Five Dimensions of CETDE

Capture: Selective Awareness

Collection transcends mere hoarding. It is conscious selection. The core principle is consolidating information gathering into a single place, reducing cognitive load from constant platform-switching.

In Practice:

  • Establish a unified collection tool as your sole reading interface. I use Readwise Reader as my unified inbox. Newsletters, RSS feeds, articles, even YouTube videos all flow into one place.
  • Employ "deferred processing": add even search findings to your tool rather than consuming immediately.
  • Ask yourself: "What does this stir in me?" to establish emotional anchoring.

Neuroscience confirms that learning without emotional engagement is ephemeral. With limited cognitive bandwidth, choosing wisely surpasses working harder.

Encode: Maintaining Flow

The encoding phase centers on rapidly organizing information structure rather than achieving deep comprehension. It converts data into information through preliminary processing. Think of this as a translator making sense of a foreign text: you're making the content comprehensible without yet integrating it into your personal knowledge system.

In Practice:

  • Use AI for structured summarization: one-sentence assessment, detailed abstract, key points. Reader's AI summarization helps me quickly grasp an article's structure and decide if it warrants deeper processing.
  • Rephrase in your own words, transforming the abstract into concrete.
  • Release binary judgments and let understanding arise naturally without solidifying into labels.

Transfer: From Perception to Cognition

The transfer phase represents the transition from short-term to long-term memory. It establishes preliminary understanding while allowing information to flow naturally. If Encode is like understanding what a map says, Transfer is like memorizing the route so you can navigate without constantly checking.

In Practice:

  • Rephrase notes in your own words and assign relevant tags. I transfer key insights into Tana, creating initial connections, while bookmarking sources in Mymind for future serendipitous rediscovery.
  • Add valuable sources to buffer tools for later rediscovery through AI-generated tags and serendipity features.
  • Harmonize with experience's natural rhythm. Don't force an insight before you're ready, but also don't cling to old ideas out of habit.

Buffering between Transfer and Distill allows information to resurface at opportune moments, providing unexpected connections when you're ready for deeper processing.

Distill: From Information to Knowledge

The distillation phase is the process's core, where deep understanding is established through cognitive conflict, active construction, and emotional anchoring. This forms knowledge authentically yours.

In Practice:

  • Use AI to automatically connect existing notes, discovering relationships and contradictions. Tana's AI helps me find unexpected links between ideas I've captured weeks apart.
  • Conduct deep inquiry exploring deeper meanings and applications. I often move to Gemini 2.5 Pro for extended dialogues, using its large context window to pursue questions across multiple turns.
  • Reorganize insights into new or updated notes.

At this stage, notes transcend simple records to become knowledge assets infused with personal reflection. You know not only the "what" but the "why." These distilled notes become ready-to-use materials for writing and creating.

Express: Natural Outflow

The expression phase transforms knowledge into creative output. Writing and dialogue themselves constitute the deepest processing, testing understanding's completeness while revealing new connections.

Expression should emerge naturally as awareness's spontaneous manifestation rather than forced production. When understanding matures, the desire to share naturally arises. This is when I know the knowledge has truly become mine.

Integrating Zen Wisdom

The CETDE framework transcends methodology. It is a path of cognitive cultivation. These five steps correspond to five dimensions of deepening awareness:

  • Capture – Pure Awareness: Non-discriminating, non-dual awareness where observer and observed are inseparable. Not creating a separate "watcher" but being the awareness itself.
  • Encode – Suchness Observation: Maintaining fluidity without solidifying understanding into fixed concepts. Seeing things as they are, not as we label them.
  • Transfer – Natural Flow: Harmonizing with experience's rhythm, allowing transformation rather than forcing it. Like watching clouds without grasping them.
  • Distill – Insight Without Dwelling: Insights arise naturally, yet remain unattached to them. Understanding emerges and dissolves without becoming new dogma.
  • Express – Compassion's Expression: Sharing emerges naturally as awareness's attribute. Wisdom flows outward without calculation or self-consciousness.

These dimensions aren't strictly sequential but different facets operating simultaneously, mutually deepening in a spiral upward path.

Raising Doubt: A Cognitive Tool for the AI Era

Let me share a recent exchange that illustrates this practice. I was watching a video that mentioned the possibility of an unknown planet in our solar system:

Me: So the solar system is definitely a single-star system?

Claude: That's the scientific consensus based on current observations.

Me: But how confident are we in this consensus? What's the actual certainty level?

Claude: Very high confidence, though there are hypotheses about distant objects like Planet Nine...

Me: What if there's something beyond our detection range? Could it be a brown dwarf? A primordial black hole?

Claude: Interesting. Those are actual scientific hypotheses. The outer solar system remains poorly observed...

Me: And if such an object exists, how might that connect to the synchronized emergence of civilizations during the Axial Age? Could cosmic factors influence human consciousness?

This wasn't random questioning. It was the Zen practice of "raising doubt" (起疑情) applied to collaborative knowledge inquiry with AI, particularly deepening the Distill phase of CETDE.

The Essence: Continuously engage and refine doubt. Not skepticism but a yearning to transcend established cognitive boundaries. This becomes the driving force enabling continuous deepening in AI dialogue.

Three Key Practices:

  1. Raise doubt without attachment to answers: True wisdom often hides within questions themselves. Maintain freshness toward questions rather than rushing to definitive conclusions.
  2. Maintain total presence: Fully immerse in the current exploration, undistracted and unrushed, catching subtle turning points in dialogue.
  3. Dual verification: Let intuitive insights undergo logical scrutiny while keeping rational analysis open to intuitive wisdom. Transcend binary oppositions, maintaining dynamic balance between "known" and "unknown."

The practice of repeatedly questioning seemingly certain conclusions allows deep insights to surface. Most crucially, raising doubt preserves cognitive independence in the algorithmic age. The "unknowing knowing" within doubt is precisely the human wisdom AI cannot replace.

Conclusion: Depth Over Breadth

The CETDE framework's core insight: deep processing is an attitude. It requires reverence toward knowledge and patience toward cognition.

In the age of information overload, remember:

  • Cognitive bandwidth is limited. Choosing wisely matters more than working harder.
  • Less is more. Depth surpasses breadth.
  • True value lies not in how much you know but in transformation's depth.

The framework provides a theoretical reference point. What matters is discovering and verifying these principles within your own experience. Tools like Readwise Reader, Tana, and Mymind can support this practice, each excelling at different stages. But the specific tools matter less than the underlying principles.

A priori knowledge forms the foundation for effectively using AI, while the practice of raising doubt provides the wisdom to apply this knowledge. Between question and answer, between known and unknown, we discover insights unique to our age.

When we truly grasp CETDE's essence, we discover its five dimensions are actually the same awareness manifesting at different levels, ultimately pointing toward a single goal: transforming information into wisdom, maintaining cognitive independence and depth in the digital age.


Previous | Next

❌