DEEP DIVE · 03 · THICKEST

Claude Cowork
the full research

GA since 2026-02-24. An autonomous desktop agent aimed at non-engineer knowledge workers. Works across local files, Gmail, and Drive. OSWorld jumped from under 15% in late 2024 to 72.5% by February 2026, a roughly fivefold gain. The optional part of the workshop, but the one with the biggest potential impact on exec work.

GA · Mac / Win · Team / Enterprise · RBAC

1. What Claude Cowork is

Claude Cowork (official spelling Cowork, not CoWork or Co-Work) is Anthropic's desktop automation agent. It's built for knowledge workers doing non-technical jobs: analysts, lawyers, finance teams, researchers.

How it sits in the product line

  • Runs inside Claude Desktop, as a capability of the macOS and Windows app
  • Not the same as Claude Code. Code is for developers (VS Code, terminal). Cowork is for everyone else (local files, apps, browser).
  • Design philosophy. "Outcome-centric" rather than "prompt-centric": you describe what you want; Claude plans the steps and runs them.
  • Permission scope. Folder-based: only the folders you allow, only what's in them.
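
The folder-scope rule can be pictured as a simple allow-list check. A minimal sketch, assuming a hypothetical `ALLOWED_ROOTS` list; this illustrates the model, not Cowork's actual implementation:

```python
from pathlib import Path

# Hypothetical allow-list: only folders the user has explicitly granted.
ALLOWED_ROOTS = [Path("/Users/alice/Projects/pilot").resolve()]

def is_path_allowed(path: str) -> bool:
    """True only if path sits inside an explicitly allowed folder."""
    target = Path(path).resolve()
    return any(root == target or root in target.parents
               for root in ALLOWED_ROOTS)
```

Everything outside the allowed roots, including their parent folders, stays invisible to the agent.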

Core capabilities

| Capability | What it does |
| --- | --- |
| File organization | Rename, sort, dedupe documents |
| Document generation | Merge multiple sources into a structured draft |
| Data extraction | Turn contracts and reports from prose into structured data |
| Research synthesis | Search, summarize, and combine across sources |
| Process automation | Excel models, decks, multi-step workflows |

2. Release history

Research Preview (2026-01-12)

Announced on 2026-01-12 as a research preview. macOS only, Pro and Max subscribers.

General Availability (2026-02-24)

GA on 2026-02-24. Same day, Anthropic shipped:

  • 13 new enterprise connectors: Google Calendar, Google Drive, Gmail, DocuSign, FactSet, and others
  • Cross-application context: tasks spanning Excel, PowerPoint, and other apps
  • Private plugin marketplace for in-house custom plugins
  • Industry templates for finance, legal, HR, operations

OSWorld benchmark trajectory

| When | Score | Model | Note |
| --- | --- | --- | --- |
| Late 2024 | <15% | Claude Opus 4.5 | Early Computer Use |
| Feb 2026 | 72.5% | Claude Sonnet 4.6 | At GA |
| Apr 2026 | 78.0% | Claude Opus 4.7 | OSWorld-Verified |

OSWorld is a 369-task benchmark covering file management, web, Office apps, multimedia, and OS operations. Human baseline: 87%.

3. Architecture

Hybrid by design

Cowork runs as a local + cloud hybrid.

Local side

  • File operations: read, edit, create, delete; only in allowed folders
  • Screen operations: window control, mouse, keyboard
  • Browser automation: a controlled Chrome instance
  • OS commands: a restricted set

Cloud side

  • Inference: Claude Opus / Sonnet for planning and judgment
  • State management: context preserved across multi-step tasks
  • Connector brokering: API calls to FactSet, DocuSign, etc.

Internet is required

Cowork talks to Anthropic's cloud constantly. If the connection drops, the local VM keeps running but the cloud loses context, which can leave the task in an inconsistent state.
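
One defensive pattern for that failure mode: persist task progress locally after each completed step, so a reconnect resumes rather than replays. A sketch; nothing here is a Cowork API, `run_with_checkpoints` is this document's illustration:

```python
import json
from pathlib import Path

CHECKPOINT = Path("task_state.json")

def run_with_checkpoints(steps, state_file=CHECKPOINT):
    """Run (name, fn) steps in order, persisting progress after each.

    If the connection drops mid-task, a restart skips the steps
    already marked done instead of replaying them.
    """
    done = json.loads(state_file.read_text()) if state_file.exists() else []
    for name, fn in steps:
        if name in done:
            continue  # already completed before the interruption
        fn()
        done.append(name)
        state_file.write_text(json.dumps(done))
    return done
```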

4. Connector catalog (as of 2026-04-21)

13 connectors are GA today. Jira and ServiceNow are in beta on the enterprise roadmap, not yet on the GA list.

| Connector | Capabilities | Use case |
| --- | --- | --- |
| Google Drive | Read, generate files | Document search and synthesis |
| Gmail | Search mail, draft replies | Triage and response management |
| Google Calendar | Query schedule, create events | Calendar ops, coordination |
| DocuSign | Read contracts, extract key terms | Risk review, redline proposals |
| FactSet | Company and market data lookup | Financial analysis, comp tables |
| Slack | Channel and message search | Knowledge extraction, sharing |
| Salesforce | CRM read and update | Sales data ops |
| LegalZoom | Legal template search | Contract templating |
| Apollo | Lead database search | Sales and marketing automation |
| Clay | Data integration and enrichment | Sales intelligence |
| Outreach | Sales sequences | Sales automation |
| MSCI | ESG data | ESG reporting |
| WordPress | Site management | Blog publishing automation |

5. Plans and pricing

Individual plans

| Plan | Monthly | Cowork limits | Context | Best for |
| --- | --- | --- | --- | --- |
| Pro | $20 | Small | 200K tokens | Base users |
| Max 5x | $100 | Medium | 200K tokens | Power users |
| Max 20x | $200 | Large | 200K tokens | High volume |

Team plans

| Plan | Min seats | Standard seat | Premium seat | Features |
| --- | --- | --- | --- | --- |
| Standard | 5 | $20/month | N/A | Claude chat only |
| Premium | 5 | $20/month | $100/month | Code + Cowork |

Enterprise

| Item | Detail |
| --- | --- |
| Seat price | Sales quote (custom) |
| Extended context | 500K tokens |
| RBAC | Groups, role definitions, per-feature control |
| HIPAA | Available on agreement |

6. Comparisons

Claude Cowork vs. Microsoft Copilot Cowork

| Attribute | Claude Cowork | Microsoft Copilot Cowork |
| --- | --- | --- |
| Release | 2026-02-24 (GA) | 2026-03-09 (research) / 03-30 (Frontier) |
| Base model | Claude Opus / Sonnet | Claude, supplied by Anthropic |
| Host | Claude Desktop (standalone) | Inside Microsoft 365 |
| Local files | Full | M365 files only |
| Browser automation | Yes | Outlook / Teams only |
| App coverage | General (any desktop app) | Outlook, Teams, Excel, Word, PowerPoint |
| Price | $20–200/month (individual) | $99/month (E7 inclusive) |

Claude Cowork vs. ChatGPT Operator

| Attribute | Claude Cowork | ChatGPT Operator |
| --- | --- | --- |
| Interface | Claude Desktop | Web / API |
| File access | Local filesystem | Cloud storage |
| Browser automation | Yes | Yes, stronger for web tasks |
| Security model | Local sandbox + cloud inference | Cloud agent |

7. Twenty exec use cases

Five each for the CEO, CFO, Legal, and IR. Time savings are rough estimates based on industry benchmarks.

CEO (5)

| # | Task | Output | Time saved |
| --- | --- | --- | --- |
| 1 | Weekly exec brief | Executive summary rolled up from multiple division reports | 4h → 30min |
| 2 | Board deck QA | Cross-version number and layout consistency checks | 2h → 15min |
| 3 | Media mentions | Google Alerts rolled into a Google Doc | 15min/day → 1min |
| 4 | Advisor-interview extraction | Transcript to key quotes and action items | 2h → 20min |
| 5 | Earnings Q&A draft | Prior-quarter answers and IR drafts into new-quarter candidate Q&A | 3h → 40min |

CFO (5)

| # | Task | Output | Time saved |
| --- | --- | --- | --- |
| 6 | Expense categorization | Multi-currency receipt images to CSV | 3h/month → 10min |
| 7 | Vendor contract review (first pass) | DocuSign integration, deltas vs. standard terms | 2h → 20min |
| 8 | Three-month rolling forecast | Last month's actuals + division projections into an updated Excel model | 5h → 30min |
| 9 | Subsidiary P&L roll-up | Merge and summarize subsidiary sheets | 4h → 45min |
| 10 | Bank and institutional investor Q&A prep | Prior Q&A plus drafts into candidate answers | 3h → 40min |

Legal (5)

| # | Task | Output | Time saved |
| --- | --- | --- | --- |
| 11 | NDA and standard contract risk screen | DocuSign + Harvey, risk flags | 30min/contract → 3min |
| 12 | Regulatory update memos | Regulator publications to an internal-impact memo | 2h → 25min |
| 13 | Litigation email evidence | Gmail search, timeline, summary | 4h → 40min |
| 14 | Template-update notification | LegalZoom new version to internal diff doc | 2h → 20min |
| 15 | Compliance Q&A log | Employee training questions and answers | 1h/month → 10min |

IR (5)

| # | Task | Output | Time saved |
| --- | --- | --- | --- |
| 16 | Analyst call summary | Analyst questions grouped by topic with draft answers | 3h → 35min |
| 17 | Competitor benchmark refresh | Competitor IR disclosures into a comparison table | 4h → 1h |
| 18 | ESG scorecard (annual) | MSCI / S&P data plus company data to a draft report | 8h → 2h |
| 19 | Monthly individual investor list update | Crunchbase / PitchBook aggregation | 2h → 15min |
| 20 | Ten-year Q&A encyclopedia maintenance | Past earnings transcripts plus FAQ tool refresh | 6h/decade → 40min |

8. MIXI deployment scenarios

Note. These are hypothetical. Any real rollout needs a proper scoping conversation with Anthropic Sales. Not built from MIXI internal data, just from how corporate strategy, legal, and IR functions typically run.

Scenario A: Strategy โ€” weekly brief automation

Current state

Strategy team pulls a weekly brief for the exec team every Monday morning. Data comes from several systems and Google Sheets, is hand-assembled in Google Docs, and is shared on Slack. Time: 3–4 hours/week

With Cowork

  1. Wire up market and competitor APIs (Similarweb, MSCI, etc.)
  2. Schedule Cowork to run Friday 17:00:
    • Pull last week's numbers from Google Sheets
    • Pull competitor movement from APIs
    • Generate a new Google Doc from template
    • Post summary to Slack
  3. Monday morning: 30-minute review and tweak by the strategy team
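
The scheduled run in step 2 can be sketched as a small assembly pipeline. `build_weekly_brief`, `fetch_sheet`, and `fetch_competitor_news` are hypothetical stand-ins for the Google Sheets and market-API connector calls, not real Cowork interfaces:

```python
def build_weekly_brief(fetch_sheet, fetch_competitor_news, template):
    """Assemble the brief body from last week's numbers and news.

    fetch_sheet   -> string of last week's KPIs
    fetch_competitor_news -> list of competitor-movement items
    template      -> document template with {numbers} and {news} slots
    """
    numbers = fetch_sheet()
    news = fetch_competitor_news()
    bullets = "\n".join(f"- {item}" for item in news)
    return template.format(numbers=numbers, news=bullets)
```

The filled-in template is what gets written to the new Google Doc and summarized to Slack.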

Result

Weekly work: 3–4h down to 30min. Annual savings ≈ 120–150 hours.

Scenario B: Legal โ€” contract risk triage

Current state

Legal reviews every contract coming in from business units. 30–45 minutes per contract.

With Cowork + DocuSign

  1. DocuSign connector pulls unsigned contracts
  2. Cowork runs the first pass:
    • Compare against standard terms
    • Flag risk items
    • Assign a level: GREEN / YELLOW / RED
    • Log results in Excel
  3. RED and YELLOW pings Legal on Slack
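
The level-assignment step can be sketched as a rule table. The clause names and lists are illustrative placeholders, not a real risk model:

```python
# Illustrative flag lists; a real deployment would draw these from
# Legal's standard-terms playbook.
RED_FLAGS = {"unlimited liability", "unilateral termination"}
YELLOW_FLAGS = {"auto-renewal", "non-standard payment terms"}

def triage(found_clauses):
    """Map deviations from standard terms to a GREEN/YELLOW/RED level."""
    clauses = {c.lower() for c in found_clauses}
    if clauses & RED_FLAGS:
        return "RED"      # escalate to Legal immediately
    if clauses & YELLOW_FLAGS:
        return "YELLOW"   # ping Legal on Slack
    return "GREEN"        # log and move on
```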

Result

First-pass screen: 45min → 3min/contract. Monthly savings ≈ 7–8 hours, ≈ 100 hours/year.

Scenario C: IR โ€” ten-year Q&A encyclopedia refresh

Current state

Ten years of investor meeting transcripts. Full refresh every 5–10 years takes 6–8 hours.

With Cowork

  1. Drop the last decade of transcripts (150 files) into Google Drive
  2. Cowork searches, categorizes, updates, and emits a markdown FAQ
  3. IR reviews the output (1–2 hours)
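
Step 2's categorize-and-emit stage, sketched with illustrative topics; the `(topic, question, answer)` shape is an assumption for this sketch, not a Cowork format:

```python
from collections import defaultdict

def build_faq(qa_pairs):
    """Group (topic, question, answer) tuples into a markdown FAQ."""
    by_topic = defaultdict(list)
    for topic, q, a in qa_pairs:
        by_topic[topic].append((q, a))
    lines = []
    for topic in sorted(by_topic):          # one section per topic
        lines.append(f"## {topic}")
        for q, a in by_topic[topic]:
            lines.append(f"**Q: {q}**\n\nA: {a}\n")
    return "\n".join(lines)
```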

Result

Manual work: 6–8h down to 2h. 4–6 hours saved per refresh.

9. Security model

RBAC

Enterprise plan only. Not available on Pro, Max, or Team.

Audit logs

Serious gap. Cowork activity isn't written to the audit log (as of 2026-04-21). For financial supervision, legal discovery, and medical workflows, anywhere a trail is legally or regulatorily required, Cowork is the wrong tool today.

| Activity | In audit log? |
| --- | --- |
| Claude Chat | Yes |
| Claude Code execution | Yes |
| Cowork tasks | No |
| File read/write | No |
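
Until Cowork actions land in the official audit log, a team can at least keep its own local trail around scripted steps. A minimal sketch; the wrapper and its field names are this document's invention, not an Anthropic feature:

```python
import json
import time
from pathlib import Path

AUDIT_LOG = Path("cowork_audit.jsonl")

def audited(action, target, fn, log=AUDIT_LOG):
    """Run fn(), appending one JSON line per action to a local log."""
    entry = {"ts": time.time(), "action": action, "target": target}
    try:
        out = fn()
        entry["result"] = "ok"
        return out
    except Exception as exc:
        entry["result"] = f"error: {exc}"
        raise
    finally:
        with log.open("a") as f:
            f.write(json.dumps(entry) + "\n")
```

A JSONL file is not a compliance-grade trail, but it gives IT something to review monthly.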

Local file permissions

Cowork runs inside a Linux sandbox inside Claude Desktop. Read / write / delete are scoped to folders the user explicitly allowed.

10. Japan operational considerations

Data residency

Where it is now. All Cowork inference runs on Anthropic US infrastructure. No region selection. If your work has residency requirements, Cowork isn't a fit today.

Bedrock workaround

Integrating Claude via AWS Bedrock lets you process in the Tokyo region:

ANTHROPIC_BASE_URL=https://bedrock.ap-northeast-1.amazonaws.com

But Cowork doesn't support Bedrock routing. This workaround only applies to Claude API and Claude Code CLI.
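
For that API path, a Bedrock request against the Tokyo region carries the model in the `modelId` call parameter and an `anthropic_version` field in the body. A sketch that only builds the request body, so no credentials or network are needed; the helper name is this document's own:

```python
import json

# Tokyo-region Bedrock runtime, per the base URL above.
BEDROCK_REGION = "ap-northeast-1"

def bedrock_messages_body(prompt, max_tokens=1024):
    """Build an Anthropic Messages API body for Bedrock invoke_model.

    Bedrock receives the model via the modelId request parameter,
    so the body carries anthropic_version instead of a model field.
    """
    return json.dumps({
        "anthropic_version": "bedrock-2023-05-31",
        "max_tokens": max_tokens,
        "messages": [{"role": "user", "content": prompt}],
    })
```

The serialized body would then be passed to the `bedrock-runtime` client's `invoke_model` call along with a Claude `modelId` from the Bedrock model catalog.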

APPI (Japan's data protection law)

| Requirement | Cowork status |
| --- | --- |
| Subject consent | Yes: folder access prompt |
| Purpose specification | Not clear |
| Safe management | Partial: sandbox isolated, but data is transferred to the US |
| Third-party transfer restriction | Data sent to Anthropic may qualify as a "transfer" |

Recommendation for MIXI. Keep APPI-covered data (employee info, customer lists, identifiable IDs) off Cowork. Scope Cowork to non-personal work: market analysis, public-document summarization, competitor research.
AUTOMATION GOVERNANCE

Deciding how far to let automation go

"OK, Cowork can automate these things. Now โ€” how far does the company let it go?" This is the question that burns the most clock during a Cowork rollout. What's technically possible and what the organization should allow aren't the same list.

Three Trust Levels

Every action gets sorted into one of three tiers: Watched, Batched, or Autonomous. The whole point is to kill grey areas.

| Level | Meaning | Good for | Bad for |
| --- | --- | --- | --- |
| Watched (human in the loop) | Agent proposes, human approves every step | Sending mail, deleting files, external charges, HR access | Nothing, but if everything lands here you lose the productivity |
| Batched (review the output) | Agent runs the full task, human reviews the results | Document drafting, data extraction, renaming, draft emails (held before send) | Anything that goes external, anything irreversible |
| Autonomous (trust it) | Routine work that doesn't need human eyes | Local read-only ops, standard roll-ups, calendar tidying, minutes formatting | Unbounded-cost workloads, external API calls |

Action type × default Trust Level

A starting draft for MIXI. Ready to hand to legal and IT at this level of detail.

| Action | Recommended Trust Level | Why |
| --- | --- | --- |
| Local file read | Autonomous | Failure cost is near zero |
| Local file write (new) | Batched | Disk exhaustion, overwrite risk |
| Local file delete | Watched | Irreversible; a bad delete is expensive |
| Gmail search / read | Batched | Scope needs a human check |
| Gmail draft | Batched | Send is a separate approval |
| Gmail send | Watched (always) | A wrong send is irreversible |
| Drive search / read | Autonomous | Read-only is fine |
| Drive new file | Batched | Verify sharing settings |
| Drive edit existing | Watched | Affects co-editors |
| Slack message send | Watched | The whole channel reads it |
| Calendar event add | Batched | Self-only vs. inviting others are different |
| Web search / scrape | Autonomous | Read-only |
| External SaaS write | Watched | Failures are public |
| Payments / billing APIs | Forbidden in Cowork | Failure mode is existential |
| Git push | Watched | History is permanent |
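
As code, the defaults above reduce to a lookup with Watched as the fail-safe for anything unclassified. A sketch; the action names are shorthand invented here:

```python
# Subset of the default-policy table, keyed by shorthand action names.
TRUST_LEVELS = {
    "local_file_read": "Autonomous",
    "local_file_write_new": "Batched",
    "local_file_delete": "Watched",
    "gmail_send": "Watched",
    "payments_api": "Forbidden",
}

def gate(action):
    """Return the trust level; unknown actions default to Watched."""
    level = TRUST_LEVELS.get(action, "Watched")
    if level == "Forbidden":
        raise PermissionError(f"{action} is not run through Cowork")
    return level
```

Defaulting unknown actions to Watched means a newly shipped connector starts under human approval until someone classifies it.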

Blast radius thinking

Before automating anything, ask one question. "If this misfires 100 times, what happens?"

Case 1. File rename automation. 100 misfires = 100 badly-named files. Recoverable. Batched works.
Case 2. Automated customer email. 100 misfires = 100 customers got a weird message. Brand hit, potentially a legal matter. Watched, no exceptions.
Case 3. Automated invoicing. 100 misfires = 100 wrong invoices, refunds, trust destroyed. Don't run this on Cowork. Dedicated system plus dual approval.
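
The three cases differ mainly in cost per misfire and in recoverability, which a back-of-envelope helper makes explicit. The 90% recovery discount and all costs are illustrative numbers, not measured data:

```python
def blast_radius(misfires, cost_per_misfire, recoverable):
    """Rough expected damage from a batch of automation failures."""
    damage = misfires * cost_per_misfire
    # Assume cleanup undoes ~90% of recoverable damage; irreversible
    # failures (sent emails, issued invoices) keep their full cost.
    return damage * (0.1 if recoverable else 1.0)
```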

Company policy template โ€” MIXI draft

An A4-sized policy ready to hand to legal and IT.

# MIXI Cowork / AI Agent Automation Policy v0.1 (draft)

## 1. Scope
This policy applies to Claude Cowork and equivalent autonomous
AI agents used inside MIXI.

## 2. Trust Levels
All automation is classified as Watched, Batched, or Autonomous.
The classification is documented for each workflow.

## 3. Forbidden areas
The following are not run through Cowork. They stay on existing
systems with dual approval.
- Payments, billing, transfers
- Writes to HR or payroll data
- Sending contracts with legal consequence
- Customer PII sent to external APIs

## 4. Audit
- Batched and Autonomous action logs retained 180 days
- Monthly anomaly review by IT, reported to Strategy
- Incidents escalated to the CTO within 24 hours

## 5. Review cadence
Revisit this policy every six months. New features (new
connectors, new capabilities) trigger an interim revision.

## 6. Violations
- First occurrence: warning plus retraining
- Second: Cowork access suspended
- Third: disciplinary review

## 7. Revision history
v0.1 (2026-04-22): initial draft

Operational honesty. Policies tend to get written and forgotten. Without a twice-yearly review and a retrospective loop from incidents, this goes stale in a year. Cowork ships new connectors quarterly; the policy has to keep up.

11. Ten adoption pitfalls

| # | Pitfall | Risk | Mitigation |
| --- | --- | --- | --- |
| 1 | No audit log for Cowork | Regulated activity can't be traced | Keep it out of regulated workflows |
| 2 | Internet required | Half-broken state when offline | Offline work uses the old method |
| 3 | NDA and training | Anthropic training on data is the default posture | Enterprise contract plus the No Training option |
| 4 | Destructive ops don't undo | A wrong delete is gone | Test environments, version control |
| 5 | No data residency | Japan PII processed in the US | Restrict to non-personal workflows |
| 6 | RBAC is Enterprise-only | Pro / Max: every user has equal power | Move to Enterprise earlier than you think |
| 7 | Plugin data leak risk | Compromised API keys become company-wide incidents | Key rotation SOP |
| 8 | File ops are irreversible | Bad renames, bad deletes | Read-only folder permissions where possible |
| 9 | OSWorld vs. reality | 72% is the ideal-app score; real-world closer to 50–60% | Start on low-stakes tasks |
| 10 | Anthropic dependency | Vendor lock-in | Keep eyes on alternatives |

12. 30-day pilot plan

Anthropic doesn't offer a formal 30-day trial program. This is a suggested self-run pilot structure.

Phase 1: plan and permissions (Day 1–3)

| Task | Owner | Detail |
| --- | --- | --- |
| Select participants | Strategy | 5–10 division heads plus 2 IT admins |
| Prepare folders | IT | Isolated pilot folders |
| Training materials | HR + tech | 30-minute onboarding video, FAQ |
| NDA / policy update | Legal | Explicit language on data sent to Anthropic |
| Licensing | Procurement | Pro / Team trial: 10 licenses, 30 days |

Phase 2: initial onboarding (Day 4–7)

  • Day 4. Kickoff meeting. Leadership sets the expectations.
  • Day 5–6. Individual training. One hour per user, on their own use case.
  • Day 7. Technical check. Security settings, connector tests.

Phase 3: real tasks (Day 8–25)

| Team | Task | Weekly cadence | Target |
| --- | --- | --- | --- |
| Strategy | Weekly brief | 1x | Manual 3h → 30min |
| CFO | Expense categorization | 1x (month-end) | 80%+ initial accuracy |
| Legal | Contract risk review | 2–3 contracts | 50% less review time |
| IR | Analyst Q&A draft | 1x | Cut 2h off the draft |

Phase 4: feedback and tuning (Day 26–28)

  • Day 26. User survey: usability, accuracy, pain points.
  • Day 27. IT and Anthropic support evaluation.
  • Day 28. Report for leadership: hours saved, rough ROI.

Phase 5: decision (Day 29–30)

| Outcome | Threshold | Next step |
| --- | --- | --- |
| Success | 75%+ target completion, satisfaction 4.0+/5.0 | Company-wide rollout, Q3 target |
| Promising, needs work | 50–75% completion | 2–4 more weeks |
| Miss | Below 50% or safety concerns | Stop; evaluate alternatives |

13. Official sources

The primary sources are in the Sources block below.