OCC
Office Cloc and Count — document metrics, structure extraction, content inspection, and code exploration for real repositories.
Experimental: All features in OCC are currently experimental. This project cannot be considered stable software yet. APIs, output formats, and command interfaces may change between minor versions.
OCC provides several command families:
- `occ [directories...]` for office document metrics, structure extraction, and `scc` code metrics
- `occ doc inspect`, `occ sheet inspect`, `occ slide inspect` for format-specific document preflight
- `occ table inspect` for structured table content extraction
- `occ code ...` for on-demand repository exploration over an in-memory code graph
- `occ workspace ...` for combined workspace analysis (code + documents + structures) and cross-document reference detection
Feature Highlights
- Office document metrics for DOCX, XLSX, PPTX, PDF, ODT, ODS, and ODP
- Document structure extraction with `--structure`
- Document inspection via `occ doc inspect` — metadata, risk flags, content stats, and preview for DOCX/ODT
- Spreadsheet inspection via `occ sheet inspect` — schema preview, risk flags, and token estimates for XLSX
- Presentation inspection via `occ slide inspect` — slide inventory, risk flags, and content preview for PPTX/ODP
- Table extraction via `occ table inspect` — structured table data from DOCX, XLSX, PPTX, ODT, and ODP
- Code metrics via scc during default scans
- Code exploration via `occ code` with exact search, pattern search, content search, callers/callees, dependency categories, inheritance, module coupling, and blocked-chain reporting
- Workspace analysis via `occ workspace` — combined code, document, and structure analysis with versioned JSON contracts and cross-document reference detection
- Explicit relationship status through `resolved`, `ambiguous`, and `unresolved`
- Dependency categorization into local, external, and unresolved imports
- JSON-first automation support across all command families
- Zero required services for `occ code`; no database or daemon
- Programmatic library access via subpath exports for the code exploration module
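One way to read the three relationship statuses is as the outcome of looking up a referenced name in an index of known definitions. The sketch below is an illustration only, not OCC's implementation: the `classify_reference` helper, the index shape, and the file paths are hypothetical. It assumes that exactly one matching definition means `resolved`, several mean `ambiguous`, and none mean `unresolved`.

```python
from typing import Dict, List

# Hypothetical index: symbol name -> files that define a symbol with that name.
SymbolIndex = Dict[str, List[str]]


def classify_reference(name: str, index: SymbolIndex) -> str:
    """Return 'resolved', 'ambiguous', or 'unresolved' for a referenced name."""
    candidates = index.get(name, [])
    if len(candidates) == 1:
        return "resolved"
    if len(candidates) > 1:
        return "ambiguous"
    return "unresolved"


# Made-up definitions for illustration; the paths are not taken from any real repository.
index: SymbolIndex = {
    "UserService": ["src/user.ts"],
    "duplicate": ["src/a.ts", "src/b.ts"],
}

statuses = {
    name: classify_reference(name, index)
    for name in ("UserService", "duplicate", "missing")
}
print(statuses)
```

Reporting the ambiguous case explicitly, rather than silently picking one candidate, is what lets a caller decide whether to ask for more context or to stop following a call chain.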
Why OCC?
Tools like scc, cloc, and tokei give you fast visibility into code. OCC extends that visibility to the rest of the repository:
- office documents that usually sit outside engineering metrics
- document structure that is useful for navigation and RAG pipelines
- repository code relationships that are useful for interactive exploration and agent workflows
For Humans
- Project audits — quantify documentation footprint alongside source code
- Migration planning — quickly find both the documents and the symbols that matter
- Onboarding — scan a repo once, then drill into specific classes, functions, and dependencies
For AI Agents
- Context budgeting — estimate document volume before ingestion
- Document triage — inspect documents, spreadsheets, and presentations for risk flags and token estimates before deep reading
- Table extraction — extract structured table data directly from documents without parsing raw XML
- Repository mapping — combine `occ --format json` for document inventory with `occ code ... --format json` for symbol and relationship data
- RAG chunk mapping — use `--structure --format json` to recover section boundaries and character offsets
- Workspace overview — `occ workspace analyze --format json` gives agents a combined code + document + structure snapshot in one call
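As a rough illustration of context budgeting, the sketch below picks documents to ingest so that their combined token estimates stay within a budget. It does not read OCC output directly: the `select_within_budget` helper is hypothetical, and the mapping of path to token estimate is an assumed intermediate form that an agent would build from whatever fields the JSON output actually provides.

```python
from typing import Dict, List


def select_within_budget(estimates: Dict[str, int], budget: int) -> List[str]:
    """Greedily pick the smallest documents first until the budget is used up."""
    chosen: List[str] = []
    remaining = budget
    # Sort by token estimate, then by path so ties are broken deterministically.
    for path, tokens in sorted(estimates.items(), key=lambda item: (item[1], item[0])):
        if tokens > remaining:
            break  # every later document is at least this large
        chosen.append(path)
        remaining -= tokens
    return chosen


# Made-up numbers for illustration; real estimates would come from OCC's JSON output.
estimates = {
    "docs/spec.docx": 1200,
    "docs/finance.xlsx": 5000,
    "docs/intro.pptx": 300,
}
print(select_within_budget(estimates, budget=2000))
```

Smallest-first selection maximizes the number of documents that fit, which suits a first triage pass; an agent that cares more about relevance than coverage would want a different ordering.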
Quick Install
# Global install
npm i -g @cesarandreslopez/occ
occ
# No-install usage
npx @cesarandreslopez/occ docs/ reports/
Quick Examples
# Document metrics + scc summary
occ docs/
# Document structure
occ --structure docs/
# Inspect a document for metadata and risk flags
occ doc inspect report.docx --format json
# Extract tables from a spreadsheet
occ table inspect finance.xlsx --format json
# Exact symbol lookup
occ code find name UserService --path .
# Inspect outgoing calls with ambiguity reporting
occ code analyze calls ambiguousCaller --path test/fixtures/code-explore
# Dependency inspection with local/external/unresolved grouping
occ code analyze deps src/deps --path test/fixtures/code-explore
# Call chain that reports blocked ambiguity
occ code analyze chain ambiguousCaller duplicate --path test/fixtures/code-explore
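For automation, the JSON-producing commands above can be wrapped in a script. The following is a minimal sketch that assumes `occ` is on `PATH` and that `occ doc inspect <file> --format json` prints a single JSON document to stdout; the helper names `run_cli` and `inspect_document` are hypothetical, and nothing is assumed about the fields inside the payload.

```python
import json
import subprocess
from typing import Any, Callable, List

# A runner takes an argv list and returns the command's stdout as text.
Runner = Callable[[List[str]], str]


def run_cli(argv: List[str]) -> str:
    """Run a command and return its stdout; raises if the exit code is non-zero."""
    result = subprocess.run(argv, capture_output=True, text=True, check=True)
    return result.stdout


def inspect_document(path: str, runner: Runner = run_cli) -> Any:
    """Call `occ doc inspect <path> --format json` and parse the JSON it prints.

    Assumption: stdout holds exactly one JSON document.
    """
    argv = ["occ", "doc", "inspect", path, "--format", "json"]
    return json.loads(runner(argv))
```

Calling `inspect_document("report.docx")` runs the real CLI. Passing a custom `runner` makes the wrapper usable in tests or dry runs without OCC installed.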
Next Steps
- Installation — install methods and runtime notes
- Quick Start — first-run walkthrough for the command families
- CLI Reference — full command and flag reference
- Output Formats — tabular and JSON payloads
- Architecture — document pipeline and code graph internals