Properties

category: spec
tags: [security, logging, owasp, plan]
last_updated: 2026-03-18
confidence: high

Security Logging Plan

Addresses OWASP A09 finding: "No audit trail for auth events, ACL changes, wiki deletions."

Events to Log

Auth server (`app/auth_server.py`)

Event	Route / Location	Fields
`login.initiated`	`POST /auth/login` (after PAR submit)	actor=handle, ip
`login.success`	`oauth_callback()` after JWT issued	actor=did, handle, ip
`login.new_user`	`oauth_callback()` → redirect to signup	actor=did, ip
`signup.success`	`signup()` after `user_model.create()`	actor=did, username, ip
`consent.granted`	`_handle_consent_post()` action=approve	actor=did, wiki_slug, client_id, ip
`consent.denied`	`_handle_consent_post()` action=deny	actor=did, wiki_slug, client_id, ip
`logout`	`oauth_logout()`	actor=did, ip
`rate_limit.hit`	`ratelimit_handler()` (429)	ip, path

Management middleware (`app/management/routes.py`)

Event	Method	Fields
`wiki.created`	`_create_wiki()` return 201	actor=did, slug
`wiki.deleted`	`_delete_wiki()` return 200	actor=did, slug
`token.regenerated`	`_regenerate_token()` return 200	actor=did, slug
`rate_limit.hit`	429 block in `__call__()`	ip, method, path

Resolver (`app/resolver.py`)

Event	Location	Fields
`auth.bearer_invalid`	`_resolve_bearer_token()` raises AuthError 401	ip, wiki_slug
`auth.bearer_mismatch`	`_resolve_bearer_token()` raises AuthError 403	ip, wiki_slug
`rate_limit.hit`	429 block in `__call__()`	ip, wiki_slug

Not logged: ACL flag changes (allow_read, allow_write, is_admin, is_approved). These happen inside otterwiki's admin UI with no current hook point. Deferred — track as a follow-on once otterwiki lifecycle hooks or the per-wiki DB plan is in place.

Log Format

Structured JSON, one object per line, emitted via Python stdlib logging to stdout → systemd journal (already configured). No new log file or rotation needed — journal handles retention (30-day, 500MB cap per robot-journald.conf).

{
  "ts": "2026-03-18T12:34:56.789Z",
  "event": "login.success",
  "actor_did": "did:plc:abc123",
  "actor_handle": "user.bsky.social",
  "wiki_slug": null,
  "client_id": null,
  "outcome": "success",
  "ip": "1.2.3.4",
  "syslog_identifier": "robot-auth"
}

Fields:

ts — UTC ISO-8601
event — dot-namespaced string (see tables above)
actor_did — DID of the acting user, or null for anonymous/system
actor_handle — AT Protocol handle, or null
wiki_slug — target wiki, or null for platform-level events
client_id — OAuth client_id for consent events, else null
outcome — "success" | "failure" | "blocked"
ip — client IP (from request.remote_addr in Flask, or get_client_ip(environ) in WSGI middleware)

Note: the sample record also shows syslog_identifier, which log() in app/audit.py does not emit. journald already attaches SYSLOG_IDENTIFIER to every entry as its own journal field, so the key is either redundant in the payload or still needs to be added to the wrapper.

PII note: IP addresses are PII. They are logged for security purposes (rate limit forensics, abuse investigation). Journal retention is 30 days — no change needed. Do not log full handles in combination with IPs in any external/forwarded log sink.
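
The forwarding rule in the PII note could be enforced with a small filter in front of any external sink. The following is a minimal sketch, not part of the plan: the function name redact_for_forwarding is hypothetical, and dropping the handle (rather than the IP) is an assumption made here, since the note only says the two must not be forwarded together.

```python
import json


def redact_for_forwarding(line: str) -> str:
    """Return a copy of one audit JSON line that is safe to forward externally.

    Assumption: when a record carries both an IP and a handle, the handle is
    nulled and the DID is kept. Which of the two to drop is a choice made for
    this sketch, not something the plan specifies.
    """
    record = json.loads(line)
    if record.get("ip") and record.get("actor_handle"):
        record["actor_handle"] = None
    return json.dumps(record)
```

Records that carry only one of the two fields pass through unchanged, so the local journal keeps the full record while forwarded copies lose the handle.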

Implementation Approach

New module: `app/audit.py`

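A thin wrapper around stdlib logging. No new dependencies. The signature below covers the fields of the JSON record; the path, method, and username fields listed in the event tables have no corresponding parameter yet and would need one (or a catch-all keyword argument) before those call sites can pass them.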

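import json
import logging
from datetime import datetime, timezone

# Dedicated logger so audit records can be filtered or routed separately.
_audit = logging.getLogger("robot.audit")

def log(event: str, *, actor_did=None, actor_handle=None,
        wiki_slug=None, client_id=None, outcome="success", ip=None):
    """Emit one audit record as a single-line JSON object at INFO level."""
    # Millisecond precision with a "Z" suffix, matching the sample record above.
    ts = datetime.now(timezone.utc).isoformat(timespec="milliseconds")
    _audit.info(json.dumps({
        "ts": ts.replace("+00:00", "Z"),
        "event": event,
        "actor_did": actor_did,
        "actor_handle": actor_handle,
        "wiki_slug": wiki_slug,
        "client_id": client_id,
        "outcome": outcome,
        "ip": ip,
    }))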

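The logger name robot.audit appears in journal output only if the handler's format string includes %(name)s; the JSON payload itself does not carry it. With such a format, operators can filter with journalctl -u robot-auth SYSLOG_IDENTIFIER=robot-auth | grep robot.audit or similar. With a bare message format, filter on the JSON instead, for example journalctl -u robot-auth -o cat | grep '"event"'.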
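
For each audit record to reach the journal as exactly one JSON object per line, the robot.audit logger needs a handler that writes only the message. The following is a minimal sketch under assumptions: the function name configure_audit_logging is hypothetical, and disabling propagation is a choice made here to avoid a second, differently formatted copy if the app's root logger also writes to stdout.

```python
import logging
import sys


def configure_audit_logging(stream=None) -> logging.Logger:
    """Attach a message-only stdout handler to the robot.audit logger."""
    logger = logging.getLogger("robot.audit")
    handler = logging.StreamHandler(stream if stream is not None else sys.stdout)
    # Bare message format: the line written is the JSON string and nothing else.
    handler.setFormatter(logging.Formatter("%(message)s"))
    logger.handlers.clear()      # make repeated calls idempotent
    logger.addHandler(handler)
    logger.setLevel(logging.INFO)
    logger.propagate = False     # assumption: avoid duplicate output via the root logger
    return logger
```

If the app's existing logging setup already formats stdout output, reusing it and leaving propagation on may be simpler; the trade-off is that each line then carries whatever prefix that setup adds.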

Call sites

app/auth_server.py: add from app import audit and call audit.log(...), or add from app.audit import log as audit_log and call audit_log(...), at the points noted in the events table. Most insertions are single lines immediately after the decision that determines the outcome. The ratelimit_handler() at the bottom of create_app() is the single place to cover all rate limit hits on the auth server.

app/management/routes.py: ManagementMiddleware.__call__() already has the 429 path. _create_wiki() and _delete_wiki() return tuples; log immediately before the return, and do the same in _regenerate_token() for token.regenerated. actor_did comes from user.user_did, ip from get_client_ip(environ) (already imported).

app/resolver.py: TenantResolver._resolve_bearer_token() raises AuthError on invalid or mismatched tokens; log before raising. The 429 block in __call__() already has client_ip. No logging in _permissions_for_user(), because permission derivation is not itself an audit event.

IP extraction

Flask routes: request.remote_addr (ProxyFix is already wired in create_app())
WSGI middleware: get_client_ip(environ) from app.rate_limit (already imported in both routes.py and resolver.py)

No request ID correlation (yet)

Cross-service correlation (e.g., tracing a consent grant through auth → resolver → MCP) would require propagating a request ID header. Deferred — the audit events are service-scoped and the actor_did + ts + wiki_slug triple is sufficient for forensics at this scale.
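
In the absence of a request ID, a forensic lookup on the actor_did + ts + wiki_slug triple could look like the sketch below. It assumes the input is an iterable of raw journal message lines (for example from journalctl -o cat) and that start and end are timezone-aware datetimes; the names parse_ts and find_events are hypothetical.

```python
import json
from datetime import datetime


def parse_ts(ts: str) -> datetime:
    # datetime.fromisoformat() accepts a trailing "Z" only from Python 3.11 on,
    # so normalise it to an explicit offset first.
    return datetime.fromisoformat(ts.replace("Z", "+00:00"))


def find_events(lines, *, actor_did, wiki_slug, start, end):
    """Yield audit records for one actor and wiki with start <= ts <= end."""
    for line in lines:
        try:
            record = json.loads(line)
        except json.JSONDecodeError:
            continue  # non-audit output sharing the same journal unit
        if not isinstance(record, dict) or "ts" not in record:
            continue
        if record.get("actor_did") != actor_did:
            continue
        if record.get("wiki_slug") != wiki_slug:
            continue
        if start <= parse_ts(record["ts"]) <= end:
            yield record
```

Running the same query against each service's journal and merging the results by ts gives an approximate cross-service timeline, limited by clock skew between hosts.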

Ansible Changes

No new service file changes needed. All services already emit stdout to the journal (StandardOutput=journal). The robot.audit logger will appear in the same journal unit as the service that emits it.
Log level configuration. Add AUDIT_LOG_LEVEL=INFO to ansible/roles/deploy/templates/robot.env.j2 and read it in app/audit.py (via logging.basicConfig or the app's existing logging setup). INFO is the correct default: audit records are emitted at INFO and must never depend on DEBUG being enabled. Setting the level above INFO would silence them.
No logrotate role needed. ansible/roles/logging/ already configures journald with 30-day retention and 500MB cap — sufficient.
Future: if a SIEM or external log forwarder is added, journalctl -o json -u robot-auth -u robot-api -u robot-mcp can be tailed. No Ansible change required at that point beyond a forwarding role.
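
Wiring AUDIT_LOG_LEVEL into app/audit.py could be as small as the sketch below. The helper name audit_level_from_env and the fallback to INFO for missing or unrecognised values are assumptions; the plan only names the variable and its default.

```python
import logging
import os

# Level names accepted from the environment; anything else falls back to INFO.
_LEVELS = {
    "DEBUG": logging.DEBUG,
    "INFO": logging.INFO,
    "WARNING": logging.WARNING,
    "ERROR": logging.ERROR,
    "CRITICAL": logging.CRITICAL,
}


def audit_level_from_env(environ=os.environ) -> int:
    """Map AUDIT_LOG_LEVEL to a logging level, defaulting to INFO."""
    name = environ.get("AUDIT_LOG_LEVEL", "INFO").strip().upper()
    return _LEVELS.get(name, logging.INFO)


logging.getLogger("robot.audit").setLevel(audit_level_from_env())
```

Because the wrapper logs at INFO, a value of WARNING or higher suppresses every audit record, so a deployment check that rejects such values may be worth considering.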

Out of Scope (Deferred)

ACL flag changes via otterwiki admin UI — needs lifecycle hook or per-wiki DB plan
Token issuance/refresh events for the ATProto OAuth session — currently no post-refresh hook; low priority since the platform JWT (24h) is the user-facing credential
Feeding into a monitoring dashboard — tracked separately
Structured log querying / alerting — revisit when user base grows beyond single-digit wikis