docs: add symptom-first troubleshooting hub and deep runbooks (#11196)
* docs(troubleshooting): add symptom-first troubleshooting runbooks * docs(troubleshooting): fix approvals command examples * docs(troubleshooting): wrap symptom cases in accordions * docs(automation): clarify userTimezone missing-key behavior * docs(troubleshooting): fix first-60-seconds ladder order
This commit is contained in:
122
docs/automation/troubleshooting.md
Normal file
122
docs/automation/troubleshooting.md
Normal file
@@ -0,0 +1,122 @@
|
||||
---
|
||||
summary: "Troubleshoot cron and heartbeat scheduling and delivery"
|
||||
read_when:
|
||||
- Cron did not run
|
||||
- Cron ran but no message was delivered
|
||||
- Heartbeat seems silent or skipped
|
||||
title: "Automation Troubleshooting"
|
||||
---
|
||||
|
||||
# Automation troubleshooting
|
||||
|
||||
Use this page for scheduler and delivery issues (`cron` + `heartbeat`).
|
||||
|
||||
## Command ladder
|
||||
|
||||
```bash
|
||||
openclaw status
|
||||
openclaw gateway status
|
||||
openclaw logs --follow
|
||||
openclaw doctor
|
||||
openclaw channels status --probe
|
||||
```
|
||||
|
||||
Then run automation checks:
|
||||
|
||||
```bash
|
||||
openclaw cron status
|
||||
openclaw cron list
|
||||
openclaw system heartbeat last
|
||||
```
|
||||
|
||||
## Cron not firing
|
||||
|
||||
```bash
|
||||
openclaw cron status
|
||||
openclaw cron list
|
||||
openclaw cron runs --id <jobId> --limit 20
|
||||
openclaw logs --follow
|
||||
```
|
||||
|
||||
Good output looks like:
|
||||
|
||||
- `cron status` reports enabled and a future `nextWakeAtMs`.
|
||||
- Job is enabled and has a valid schedule/timezone.
|
||||
- `cron runs` shows `ok` or explicit skip reason.
|
||||
|
||||
Common signatures:
|
||||
|
||||
- `cron: scheduler disabled; jobs will not run automatically` → cron disabled in config/env.
|
||||
- `cron: timer tick failed` → scheduler tick crashed; inspect surrounding stack/log context.
|
||||
- `reason: not-due` in run output → manual run called without `--force` and job not due yet.
|
||||
|
||||
## Cron fired but no delivery
|
||||
|
||||
```bash
|
||||
openclaw cron runs --id <jobId> --limit 20
|
||||
openclaw cron list
|
||||
openclaw channels status --probe
|
||||
openclaw logs --follow
|
||||
```
|
||||
|
||||
Good output looks like:
|
||||
|
||||
- Run status is `ok`.
|
||||
- Delivery mode/target are set for isolated jobs.
|
||||
- Channel probe reports target channel connected.
|
||||
|
||||
Common signatures:
|
||||
|
||||
- Run succeeded but delivery mode is `none` → no external message is expected.
|
||||
- Delivery target missing/invalid (`channel`/`to`) → run may succeed internally but skip outbound.
|
||||
- Channel auth errors (`unauthorized`, `missing_scope`, `Forbidden`) → delivery blocked by channel credentials/permissions.
|
||||
|
||||
## Heartbeat suppressed or skipped
|
||||
|
||||
```bash
|
||||
openclaw system heartbeat last
|
||||
openclaw logs --follow
|
||||
openclaw config get agents.defaults.heartbeat
|
||||
openclaw channels status --probe
|
||||
```
|
||||
|
||||
Good output looks like:
|
||||
|
||||
- Heartbeat enabled with non-zero interval.
|
||||
- Last heartbeat result is `ran` (or skip reason is understood).
|
||||
|
||||
Common signatures:
|
||||
|
||||
- `heartbeat skipped` with `reason=quiet-hours` → outside `activeHours`.
|
||||
- `requests-in-flight` → main lane busy; heartbeat deferred.
|
||||
- `empty-heartbeat-file` → `HEARTBEAT.md` exists but has no actionable content.
|
||||
- `alerts-disabled` → visibility settings suppress outbound heartbeat messages.
|
||||
|
||||
## Timezone and activeHours gotchas
|
||||
|
||||
```bash
|
||||
openclaw config get agents.defaults.heartbeat.activeHours
|
||||
openclaw config get agents.defaults.heartbeat.activeHours.timezone
|
||||
openclaw config get agents.defaults.userTimezone || echo "agents.defaults.userTimezone not set"
|
||||
openclaw cron list
|
||||
openclaw logs --follow
|
||||
```
|
||||
|
||||
Quick rules:
|
||||
|
||||
- `Config path not found: agents.defaults.userTimezone` means the key is unset; heartbeat falls back to host timezone (or `activeHours.timezone` if set).
|
||||
- Cron without `--tz` uses gateway host timezone.
|
||||
- Heartbeat `activeHours` uses configured timezone resolution (`user`, `local`, or explicit IANA tz).
|
||||
- ISO timestamps without timezone are treated as UTC for cron `at` schedules.
|
||||
|
||||
Common signatures:
|
||||
|
||||
- Jobs run at the wrong wall-clock time after host timezone changes.
|
||||
- Heartbeat always skipped during your daytime because `activeHours.timezone` is wrong.
|
||||
|
||||
Related:
|
||||
|
||||
- [/automation/cron-jobs](/automation/cron-jobs)
|
||||
- [/gateway/heartbeat](/gateway/heartbeat)
|
||||
- [/automation/cron-vs-heartbeat](/automation/cron-vs-heartbeat)
|
||||
- [/concepts/timezone](/concepts/timezone)
|
||||
Reference in New Issue
Block a user