Issue
Codex is continuously writing a large amount of data to the local SQLite feedback log database:
- ~/.codex/logs_2.sqlite
- ~/.codex/logs_2.sqlite-wal
- ~/.codex/logs_2.sqlite-shm
On my machine, after about 21 days of uptime, about 37 TB has been written to the main SSD. Process-level and file-level checks show that the Codex SQLite logs are the main continuous writer.
That extrapolates to roughly 640 TB/year. On a 1 TB SSD, that is about 640 full-drive writes per year. Some consumer SSDs are rated around 600 TBW, so this could consume roughly a full drive’s warranted write endurance in less than a year.
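The extrapolation above can be reproduced with a few lines of arithmetic. This is a minimal sketch that assumes the write rate stays constant over the year; the function names are illustrative, and the 1 TB capacity and 600 TBW rating are the example figures from the text, not measurements.

```rust
/// Extrapolates an observed write volume (in TB) to a yearly figure,
/// assuming the rate seen during the observation window stays constant.
fn tb_per_year(tb_written: f64, days_observed: f64) -> f64 {
    tb_written / days_observed * 365.0
}

/// Days until a drive's rated endurance (TBW) is used up at a constant daily rate.
fn days_to_exhaust(tbw_rating: f64, tb_per_day: f64) -> f64 {
    tbw_rating / tb_per_day
}

fn main() {
    let tb_written = 37.0; // observed SSD writes
    let days = 21.0; // uptime over which they accumulated
    let capacity_tb = 1.0; // example drive capacity
    let tbw_rating = 600.0; // example endurance rating

    let yearly = tb_per_year(tb_written, days);
    let drive_writes = yearly / capacity_tb;
    let days_left = days_to_exhaust(tbw_rating, tb_written / days);

    println!("{yearly:.0} TB/year, {drive_writes:.0} full-drive writes/year");
    println!("{tbw_rating:.0} TBW consumed in about {days_left:.0} days");
}
```

With these inputs the yearly figure comes out slightly above 640 TB, and a 600 TBW rating would be consumed in a little under a year.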
Evidence
Current retained rows in logs_2.sqlite:
| metric | value |
|---|---|
| retained rows | 681,774 |
| estimated retained log content | 1,035.6 MiB |
Level distribution:
| level | estimated MiB | byte % |
|---|---|---|
| TRACE | 732.5 | 70.7% |
| INFO | 266.5 | 25.7% |
| DEBUG | 30.6 | 3.0% |
| WARN | 5.9 | 0.6% |
Largest target+level pairs:
| target | level | estimated MiB |
|---|---|---|
| codex_api::endpoint::responses_websocket | TRACE | 527.4 |
| codex_otel.log_only | INFO | 141.2 |
| codex_otel.trace_safe | INFO | 121.2 |
| log | TRACE | 97.4 |
| codex_client::transport | TRACE | 60.1 |
| codex_core::stream_events_utils | DEBUG | 27.5 |
| codex_api::sse::responses | TRACE | 19.1 |
The top sources are mostly global TRACE logs, mirrored telemetry logs, and raw websocket/SSE payload logging. TRACE alone is about 70.7% of retained bytes. codex_otel.log_only + codex_otel.trace_safe add another 25.3%. Filtering these categories should remove roughly 96% of retained log bytes in this sample without fully disabling feedback logs.
Sanitized examples from the most frequent TRACE source: target=log
These are high-frequency retained samples. Raw websocket/SSE payload bodies are intentionally not included because they may contain private conversation content.
128,764x TRACE log: inotify event: ... mask: OPEN, name: Some("ld.so.cache")
37,982x TRACE log: inotify event: ... mask: OPEN, name: Some("locale.alias")
23,843x TRACE log: inotify event: ... mask: OPEN, name: Some("passwd")
3,639x TRACE log: <tokio-tungstenite checkout>/src/compat.rs:131 AllowStd.with_context
3,505x TRACE log: <tokio-tungstenite checkout>/src/lib.rs:245 WebSocketStream.with_context
3,362x TRACE log: <tokio-tungstenite checkout>/src/compat.rs:154 Read.read
3,356x TRACE log: <tokio-tungstenite checkout>/src/compat.rs:157 Read.with_context read -> poll_read
3,230x TRACE log: <tokio-tungstenite checkout>/src/lib.rs:294 Stream.poll_next
3,227x TRACE log: <tokio-tungstenite checkout>/src/lib.rs:304 Stream.with_context poll_next -> read()
3,213x TRACE log: inotify event: ... mask: OPEN, name: Some("nsswitch.conf")
2,001x TRACE log: WouldBlock
1,217x TRACE log: Masked: false
1,169x TRACE log: Opcode: Data(Text)
1,169x TRACE log: First: 11000001
Sanitized examples from frequent INFO sources
The dominant INFO sources are mostly repeated OpenTelemetry mirror events. IDs are redacted.
843x INFO codex_client::custom_ca:
using system root certificates because no CA override environment variable was selected ...
334x INFO codex_otel.trace_safe:
session_loop{thread_id=<redacted>}:submission_dispatch{otel.name="op.dispatch.user_input" submission.id=<redacted> codex.op="user_input"}:turn{otel.name="session_task.turn" thread.id=<redacted> ...}
333x INFO codex_otel.log_only:
session_loop{thread_id=<redacted>}:submission_dispatch{otel.name="op.dispatch.user_input" submission.id=<redacted> codex.op="user_input"}:turn{otel.name="session_task.turn" thread.id=<redacted> ...}
332x INFO codex_otel.log_only:
session_loop{thread_id=<redacted>}:submission_dispatch{otel.name="op.dispatch.user_input_with_turn_context" submission.id=<redacted> codex.op="user_input_with_turn_context"}:turn{otel.name="session_task.turn" thread.id=<redacted> ...}
332x INFO codex_otel.trace_safe:
session_loop{thread_id=<redacted>}:submission_dispatch{otel.name="op.dispatch.user_input_with_turn_context" submission.id=<redacted> codex.op="user_input_with_turn_context"}:turn{otel.name="session_task.turn" thread.id=<redacted> ...}
Write amplification
The retained DB size hides the real write volume. In a 15-second sample:
| metric | before | after |
|---|---|---|
| retained rows | 681,774 | 681,774 |
| max row id | 5,003,347,015 | 5,003,383,226 |
About 36,211 rows were inserted in 15 seconds, while retained row count stayed flat. This suggests continuous insert-and-prune write amplification: rows are inserted, indexed, written to WAL, then pruned.
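The numbers in the two tables can be combined into a rough estimate of the logical insert volume. This is only a sketch under stated assumptions: row ids increase by one per insert, the 15-second sample is representative, and newly inserted rows are about as large as the retained rows on average. It counts log content only and ignores index, WAL, and page-level overhead. The function names are illustrative.

```rust
/// Rows inserted between two samples, assuming ids increase by one per insert.
fn rows_inserted(max_id_before: u64, max_id_after: u64) -> u64 {
    max_id_after - max_id_before
}

/// Average insert rate over the sample window.
fn rows_per_second(rows: u64, sample_secs: f64) -> f64 {
    rows as f64 / sample_secs
}

/// Average content size of a retained row, in bytes.
fn avg_row_bytes(retained_mib: f64, retained_rows: u64) -> f64 {
    retained_mib * 1024.0 * 1024.0 / retained_rows as f64
}

fn main() {
    let rows = rows_inserted(5_003_347_015, 5_003_383_226);
    let rate = rows_per_second(rows, 15.0);
    let row_bytes = avg_row_bytes(1035.6, 681_774);
    // Logical log content per day in decimal terabytes.
    let tb_per_day = rate * row_bytes * 86_400.0 / 1e12;

    println!("{rows} rows in 15 s = {rate:.0} rows/s");
    println!("average retained row: {row_bytes:.0} bytes");
    println!("logical log content: {tb_per_day:.2} TB/day");
}
```

Under these assumptions the logical content alone is roughly 0.33 TB/day, while 37 TB over 21 days is about 1.76 TB/day at the SSD level. If the sample is representative and Codex is the dominant writer, that gap of about five times would be consistent with the insert, index, WAL, and prune cycle multiplying the bytes actually written.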
Likely cause
The SQLite feedback log sink is installed with a global TRACE default:
Targets::new().with_default(Level::TRACE)
This persists all targets at TRACE level by default, including dependency/internal logs and large raw protocol payloads.
Proposed fix
Keep feedback logs enabled, but narrow what is persisted by default:
- Do not use global TRACE for the SQLite feedback log sink.
- Drop or raise thresholds for low-value dependency noise, especially target=log, hyper_util, tokio-tungstenite internals, inotify spam, and low-level OpenTelemetry SDK logs.
- Avoid persisting full raw websocket/SSE payloads by default. Store summaries instead: event kind, duration, success/error, token usage, and payload byte length.
- Avoid persisting mirrored codex_otel.log_only/codex_otel.trace_safe events unless they are explicitly useful for feedback debugging.
- Add a global logs DB size/write cap. Per-thread caps are not enough when many threads/processes exist.
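As an illustration of the payload-summary idea in the list above, here is a minimal sketch of what could be logged in place of a raw websocket/SSE body. The `PayloadSummary` type, its field names, and the event kind string are hypothetical and do not come from the Codex codebase; only the set of fields follows the list.

```rust
use std::time::Duration;

/// Hypothetical summary stored instead of a raw payload body.
#[derive(Debug, Clone, PartialEq)]
struct PayloadSummary {
    event_kind: String,
    duration: Duration,
    success: bool,
    total_tokens: Option<u64>,
    payload_bytes: usize,
}

impl PayloadSummary {
    /// Records only the payload length; the body itself is never copied.
    fn from_payload(
        event_kind: &str,
        payload: &[u8],
        duration: Duration,
        success: bool,
        total_tokens: Option<u64>,
    ) -> Self {
        Self {
            event_kind: event_kind.to_string(),
            duration,
            success,
            total_tokens,
            payload_bytes: payload.len(),
        }
    }

    /// Compact single-line form suitable for a log row.
    fn to_log_line(&self) -> String {
        let tokens = self
            .total_tokens
            .map_or_else(|| "-".to_string(), |t| t.to_string());
        format!(
            "kind={} duration_ms={} ok={} tokens={} bytes={}",
            self.event_kind,
            self.duration.as_millis(),
            self.success,
            tokens,
            self.payload_bytes
        )
    }
}

fn main() {
    let body = br#"{"secret":"text"}"#;
    let summary = PayloadSummary::from_payload(
        "example.event", // hypothetical event kind
        body,
        Duration::from_millis(120),
        true,
        Some(42),
    );
    println!("{}", summary.to_log_line());
}
```

A row like this has a small, bounded size regardless of the payload length, and it cannot leak conversation content because the body never reaches the log sink.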
An optional escape hatch such as sqlite_logs_enabled = false would still be useful, but the main fix should be better default filtering.
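A narrower default filter could look roughly like the sketch below. It uses the `Targets` filter from `tracing-subscriber` (0.3) and `Level` from `tracing`, both assumed as dependencies. The function name and the specific thresholds are illustrative assumptions, not a tested recommendation; the target names are taken from the tables above.

```rust
use tracing::Level;
use tracing_subscriber::filter::{LevelFilter, Targets};

/// Hypothetical replacement for the global TRACE default on the SQLite sink.
/// Thresholds are illustrative only.
fn feedback_log_filter() -> Targets {
    Targets::new()
        // Persist INFO and above unless a target is listed below.
        .with_default(LevelFilter::INFO)
        // Dependency noise bridged from the `log` crate (inotify, tungstenite).
        // Targets match by prefix, so this also covers targets starting with "log".
        .with_target("log", LevelFilter::WARN)
        .with_target("hyper_util", LevelFilter::WARN)
        // Mirrored telemetry events.
        .with_target("codex_otel.log_only", LevelFilter::OFF)
        .with_target("codex_otel.trace_safe", LevelFilter::OFF)
        // Raw websocket/SSE payload logging.
        .with_target("codex_api::endpoint::responses_websocket", LevelFilter::INFO)
        .with_target("codex_api::sse::responses", LevelFilter::INFO)
}

fn main() {
    let filter = feedback_log_filter();
    // TRACE noise from the `log` target is dropped.
    assert!(!filter.would_enable("log", &Level::TRACE));
    // Mirrored telemetry is dropped even at INFO.
    assert!(!filter.would_enable("codex_otel.log_only", &Level::INFO));
    // Ordinary INFO events from other targets are still persisted.
    assert!(filter.would_enable("codex_core", &Level::INFO));
}
```

Keeping the default at INFO rather than turning the sink off preserves most of the feedback value, while the per-target overrides address the categories that account for roughly 96% of retained bytes in this sample.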



