Accounts And Data Model

The application data model is intentionally project-centered. Projects are user-defined metadata entities that relate external files, source-code pages, requirements, notes, decisions, ownership scopes, and collaboration groups. The UI calls user-uploaded external artifacts "files"; the shared model stores them as document* artifact records because they carry extracted metadata, write history, storage paths, and cross-object links. Those artifacts resolve through a project for permissions, billing, storage paths, audit history, and most workflow state. Teams and accounts organize projects and grant access; they do not replace the project as the core workspace object.

The executable source of truth for this model is src/lib/workspace-model.js. The tests in tests/workspace-model.test.mjs pin the expected permission, storage, metadata, and backend-neutral behavior.

Local And Cloud Identity

The same logical model is used in local-only mode and cloud-connected mode. A localhost deployment can manage accounts, projects, file artifacts, and sync metadata without cloud identity. When the app is connected to AWS or Azure, provider identity is normalized into the same application user, account membership, team membership, and project grant records.

Cloud identity answers "who signed in to this deployment." Application identity answers "which account, team, project, and document can this person access." Those two layers should not be collapsed.

Object Hierarchy

The primary relationship is:

User identity
  -> personal workspace account
  -> projects
  -> document artifacts

Client or firm account
  -> account members
  -> account-scoped teams
  -> projects
  -> document artifacts

Personal workspace team
  -> team members
  -> projects
  -> document artifacts

Legacy standalone team records can also be imported or retained for compatibility, but the current UI creates new teams inside either an account or a personal workspace account.

Projects can be in one of three assignment states:

Private workspace project: workspace_id is set, and account_id and team_id are normally null.
Team-assigned project: team_id is set. If the team is account-scoped, the project also carries the team's account_id.
Account-assigned project: account_id is set, and team_id is cleared when the project is assigned directly to the account.
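
The three states above can be told apart from the scope fields alone. The following is a minimal sketch, not the code in src/lib/workspace-model.js; the function name, the returned labels, and the "unassigned" fallback are assumptions made for illustration.

```javascript
// Hypothetical helper: classify a project's assignment state from its scope fields.
function projectAssignmentState(project) {
  // A team assignment wins even when account_id is also set, because an
  // account-scoped team copies its account_id onto the project.
  if (project.team_id) return "team";
  // account_id without team_id means the project is assigned directly to the account.
  if (project.account_id) return "account";
  if (project.workspace_id) return "private_workspace";
  // Not described in the text; included only so the function always returns a label.
  return "unassigned";
}
```

Checking team_id before account_id matters: a project on an account-scoped team carries both ids and should still be reported as team-assigned.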

Document artifact records keep denormalized project_id, account_id, and team_id fields so uploaded files can be listed, filtered, linked to source/code references, and indexed without losing the project ownership path. The canonical permission check still resolves through the project and its grants.

For account projects, document storage uses the account-owned prefix:

accounts/{account_id}/projects/{project_id}/documents/{document_id}/raw/{filename}

For private local workspace projects, document storage uses the workspace-owned prefix:

workspaces/{workspace_id}/projects/{project_id}/documents/{document_id}/raw/{filename}

The same prefix shape is used locally, in AWS S3, and in Azure Blob Storage.
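
A key builder for the two prefixes might look like the sketch below. The function name and its argument object are hypothetical; only the path templates come from this document.

```javascript
// Hypothetical helper: build a provider-neutral storage key for a raw upload.
function buildDocumentStorageKey({ account_id, workspace_id, project_id, document_id, filename }) {
  let ownerPrefix;
  if (account_id) {
    // Account projects use the account-owned prefix.
    ownerPrefix = `accounts/${account_id}`;
  } else if (workspace_id) {
    // Private workspace projects use the workspace-owned prefix.
    ownerPrefix = `workspaces/${workspace_id}`;
  } else {
    throw new Error("a project needs an account_id or a workspace_id to derive a storage key");
  }
  return `${ownerPrefix}/projects/${project_id}/documents/${document_id}/raw/${filename}`;
}
```

Because the result is a plain string, the same value can serve as a local relative path, an S3 object key, or an Azure blob name.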

Users And Deployment Identity

There are two identity layers:

Deployment identity: AWS Cognito groups or Microsoft Entra app roles identify who signed in and which deployment-level roles are present.
Application identity: normalized user_id, account memberships, team memberships, project grants, and notification preferences decide what the user can do.

The app normalizes provider claims into an application user with normalizeUserProfile. Provider groups such as application_super_admin, PlatformAdmin, or ProjectOwner are useful deployment claims, but they are not enough to grant document access. The API must always compute effective project permissions from application records.

An application super admin can create and inspect platform-level account metadata. That role alone does not grant document-content access to restricted projects; access still requires a project grant, account membership, team membership, or explicit owner rule.
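
To illustrate the split between the two layers, the sketch below maps provider claims to an application user. It is not the real normalizeUserProfile, whose signature is not shown here. The function name, the provider labels "aws" and "azure", and the user_id format are assumptions. The claim names are the standard ones: Cognito tokens carry sub and cognito:groups, and Entra tokens carry oid and roles.

```javascript
// Hypothetical sketch of claim normalization (not the project's normalizeUserProfile).
function sketchNormalizeClaims(provider, claims) {
  const isAws = provider === "aws";
  // Cognito identifies the user by `sub`; Entra identifies the user by `oid`.
  const subject = isAws ? claims.sub : claims.oid;
  // Cognito groups arrive in `cognito:groups`; Entra app roles arrive in `roles`.
  const roles = (isAws ? claims["cognito:groups"] : claims.roles) || [];
  return {
    user_id: `${provider}:${subject}`, // assumed id format
    email: (claims.email || claims.preferred_username || "").toLowerCase(),
    // Carried for deployment-level checks only; project access is computed
    // from application records, never from these roles.
    deployment_roles: [...roles],
  };
}
```

The returned deployment_roles field is kept separate so that nothing downstream can mistake a provider group for a project grant.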

Accounts

Application accounts are customer or workspace containers. They are different from AWS accounts or Azure subscriptions.

Use application accounts to keep client, business, department, firm, and personal workspace records logically separated. A single AWS account or Azure subscription can host many application accounts, and a single application account can later be synced from local storage to AWS, Azure, or both without changing its meaning.

Account records include:

account_id
name
type, such as personal, client_org, or consulting_firm
status
data_classification
default_project_visibility
billing_plan
owner_user_id
created_by
created_at

Account member records attach users to accounts with roles and permissions:

account_id
user_id
email
name
role, such as owner, account_admin, administrator, editor, or viewer
status
permissions, such as manage_account, create_project, create_team, invite_members, and share_project

Account owners and Super-admins have manage_account and can manage account lifecycle, ownership transfer, member removal, and administrative notification routing. The Super-admin role is stored as role: "account_admin" for backend compatibility. Account Administrators can create projects and teams, invite members, and share projects, but they do not manage account ownership or lifecycle unless they also have manage_account.
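
The role rules above can be sketched as role defaults combined with any permissions stored on the member record. The names below are hypothetical. The sketch also assumes that owners hold every account permission, that editor and viewer carry no account-level defaults, and that "active" is the status value for a usable membership. None of those three points is stated in this document.

```javascript
const ALL_ACCOUNT_PERMISSIONS = [
  "manage_account", "create_project", "create_team", "invite_members", "share_project",
];

// Assumed defaults per role; Super-admins are stored as "account_admin".
const ACCOUNT_ROLE_DEFAULTS = {
  owner: ALL_ACCOUNT_PERMISSIONS,
  account_admin: ALL_ACCOUNT_PERMISSIONS,
  administrator: ALL_ACCOUNT_PERMISSIONS.filter((p) => p !== "manage_account"),
  editor: [], // assumption: no account-level defaults
  viewer: [], // assumption: no account-level defaults
};

// Hypothetical helper: effective account permissions for one member record.
function accountMemberPermissions(member) {
  if (member.status !== "active") return new Set();
  const defaults = ACCOUNT_ROLE_DEFAULTS[member.role] ?? [];
  // Explicit permissions on the record extend the role defaults, which is how
  // an administrator could also hold manage_account.
  return new Set([...defaults, ...(member.permissions ?? [])]);
}
```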

Account Separation

Account separation is enforced through application records, storage prefixes, and permission evaluation:

Account records identify the client, business, department, firm, or workspace container.
Account member records identify who can administer or participate in that account.
Account-scoped teams provide collaboration groups inside one account boundary.
Account project grants can expose projects to account members while restricted projects still require explicit grants.
Document storage keys include the account id when a project belongs to an account.

This separation is logical and portable. It does not require a separate AWS account, Azure subscription, bucket, container, or database for every application account, although production deployments can choose stronger physical isolation for regulatory or enterprise reasons.

Teams

Teams are assignment and collaboration scopes. A team can be account-scoped, personal-workspace scoped, or legacy standalone.

Team records include:

team_id
account_id
workspace_id
scope_type: account, personal_workspace, or standalone
parent_team_id
name
status
owner_user_id
root_admin_user_id
created_by
created_at

Account-scoped teams inherit co-administration from account owners and Super-admins. Personal-workspace teams must retain an active root administrator. Legacy standalone teams follow the same root-administrator rule when they are imported or retained. Team membership alone does not mutate a project; it only matters when a project is assigned to the team or when a project grant targets that team.

Projects

Projects are the main durable workspace object. A project owns its external file set, source/code relationships, access grants, object/blob prefix, metadata history, write events, status, and billing scope.

Project records include:

project_id
name
description
account_id
team_id
workspace_id
restricted
status
visibility
billing_scope
created_by
created_at
updated_at

Project assignment is explicit. Moving a project between private, team, and account scopes should update the project row and any denormalized document assignment fields. Assigning a project to an account-scoped team should copy the team's account_id; assigning directly to an account should clear team_id.
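
A reassignment that keeps the denormalized document fields in step could be sketched as follows. The function name and the shape of the target argument are hypothetical. Setting account_id to null for a team without an account, and leaving workspace_id untouched on team or account assignment, are assumptions rather than documented behavior.

```javascript
// Hypothetical helper: move a project to a new scope and mirror the change
// onto the denormalized fields of its document artifacts.
function reassignProject(project, documents, target) {
  let scope;
  if (target.type === "team") {
    // An account-scoped team contributes its account_id to the project.
    scope = { team_id: target.team.team_id, account_id: target.team.account_id ?? null };
  } else if (target.type === "account") {
    // Direct account assignment clears team_id.
    scope = { team_id: null, account_id: target.account_id };
  } else {
    // Private workspace project: account_id and team_id are normally null.
    scope = { team_id: null, account_id: null, workspace_id: target.workspace_id };
  }
  const updatedProject = { ...project, ...scope, updated_at: new Date().toISOString() };
  const updatedDocuments = documents.map((doc) =>
    doc.project_id === project.project_id
      ? { ...doc, account_id: updatedProject.account_id, team_id: updatedProject.team_id }
      : doc
  );
  return { project: updatedProject, documents: updatedDocuments };
}
```

Returning new objects instead of mutating the inputs lets a caller persist the project row and the document rows together, or discard both if either write fails.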

Project grants are separate records:

grant_id
project_id
target_type: user, team, or account
target_id
permissions
source
note
created_by
created_at

The current project permission ladder is view, comment, review, write, manage_access, and owner. The owner permission expands to all project permissions.
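
The owner expansion can be sketched as below. The function name is hypothetical. The sketch expands only owner. Whether a higher rung such as write also implies the lower rungs is not stated here, so it is deliberately left out.

```javascript
const PROJECT_PERMISSIONS = ["view", "comment", "review", "write", "manage_access", "owner"];

// Hypothetical helper: expand a grant's permission list.
function expandProjectPermissions(granted) {
  // Drop anything that is not on the ladder, and remove duplicates.
  const known = [...new Set(granted.filter((p) => PROJECT_PERMISSIONS.includes(p)))];
  // owner expands to every project permission; other rungs are kept as granted.
  return known.includes("owner") ? [...PROJECT_PERMISSIONS] : known;
}
```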

Documents

Documents are the internal artifact records for external files. The product UI labels these objects as files, while storage, sync, API contracts, metadata records, and write events continue to use document_id and document* naming. The default owner type is project, because project ownership gives one place to resolve grants, storage paths, activity, and billing.

Document artifact records include:

document_id
project_id
account_id
team_id
owner_type
owner_id
name
title
document_type
status
classification
file_extension
mime_type
source_date
issuing_body
geographic_scope
revision
project_role
data_object_type
metadata_confidence
extraction_summary
metadata_sources
metadata_fallback_pathways
raw_metadata
size_bytes
uploaded_by
uploaded_at
storage_provider
storage_key
concept_tags
standard_relevance
project_data_elements
related_object_ids

The same storage_key shape is used for AWS S3 object keys and Azure Blob names:

accounts/{account_id}/projects/{project_id}/documents/{document_id}/raw/{filename}
workspaces/{workspace_id}/projects/{project_id}/documents/{document_id}/raw/{filename}

The storage_provider field records the selected backend, but the path is provider-neutral.

Version And History Strategy

Project and document records hold current state. Versions, evidence, and user-authored history are saved as append-only records linked back to the document or project.

Current-state records:

Project rows store the active assignment, status, visibility, and updated_at.
Document artifact rows store the active metadata fields, current status, object/blob path, tags, related source/code references, requirement relevance, and project data elements.
Account and team rows store active membership and ownership configuration.

Append-only history records:

Metadata records capture original extraction snapshots, user-curated metadata snapshots, and write-context snapshots.
Write events capture user notes or mapping decisions with actor, timestamp, document context, and linked metadata record id.
Project grants and invitations preserve who created access changes and when.
Derived extraction outputs should be stored as child artifacts, such as chunks.json, tables.json, figures.json, or document_metadata.json, with a parent link to the raw uploaded document.

Metadata records use document-local keys:

pk = DOCUMENT#{document_id}
sk = METADATA#{timestamp}#{record_type}

Write events use:

pk = DOCUMENT#{document_id}
sk = WRITE#{timestamp}#{write_event_id}
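
Both key shapes can be produced by small builders like the ones sketched here. The function names are hypothetical, and the sketch assumes timestamps are ISO 8601 UTC strings such as those returned by Date.prototype.toISOString.

```javascript
// Hypothetical key builders for document-local history records.
function metadataRecordKey(document_id, timestamp, record_type) {
  return { pk: `DOCUMENT#${document_id}`, sk: `METADATA#${timestamp}#${record_type}` };
}

function writeEventKey(document_id, timestamp, write_event_id) {
  return { pk: `DOCUMENT#${document_id}`, sk: `WRITE#${timestamp}#${write_event_id}` };
}
```

With fixed-width ISO timestamps, sorting the sk values as strings also sorts the records chronologically within each of the METADATA# and WRITE# prefixes.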

Each metadata record carries schema_version: 1. That schema version describes the saved snapshot shape. It is separate from a source document's revision field, which describes the source file or document version inferred from metadata or filenames.

When the UI saves metadata, the artifact's current editable fields can change, but a metadata record should also be written so prior values remain reconstructable. When the UI captures a write event, it should also create a write_context metadata record so the exact visible document metadata can be reconstructed later.
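
The pairing of a write event with its write_context record might be sketched as follows. The function name, the in-memory history array, the snapshot field, and the use of the metadata record's sk as the link are all assumptions. Only the key shapes, schema_version: 1, and the write_context record type come from this document.

```javascript
// Hypothetical helper: append a write event plus the write_context snapshot
// that lets the visible metadata be reconstructed later.
function captureWriteEvent(history, document, { actor, note, write_event_id, timestamp }) {
  const pk = `DOCUMENT#${document.document_id}`;
  const metadataRecord = {
    pk,
    sk: `METADATA#${timestamp}#write_context`,
    record_type: "write_context",
    schema_version: 1,
    // Deep copy so later edits to the artifact cannot alter the snapshot.
    snapshot: JSON.parse(JSON.stringify(document)),
  };
  const writeEvent = {
    pk,
    sk: `WRITE#${timestamp}#${write_event_id}`,
    actor,
    timestamp,
    note,
    document_id: document.document_id,
    metadata_record_sk: metadataRecord.sk, // assumed link field
  };
  // Append only: existing history entries are never rewritten.
  history.push(metadataRecord, writeEvent);
  return writeEvent;
}
```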

Permission Evaluation

Effective project permissions are computed from:

Direct user project grants.
Team project grants plus active team membership.
Account project grants plus active account membership.
Account membership fallback for non-restricted account projects.
Owner grants created when projects are created.
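
The sources listed above can be combined in a single pass over the grants. This is a minimal sketch with a hypothetical function name and input shapes. It assumes that "active" is the membership status value and that the account membership fallback grants view only; the document does not say which permissions the fallback carries. Owner grants are treated as ordinary user grants that include owner.

```javascript
const LADDER = ["view", "comment", "review", "write", "manage_access", "owner"];

// Hypothetical helper: compute a user's effective permissions on one project.
function effectiveProjectPermissions({ user_id, project, grants, teamMemberships, accountMemberships }) {
  const mine = (m) => m.user_id === user_id && m.status === "active";
  const activeTeams = new Set(teamMemberships.filter(mine).map((m) => m.team_id));
  const activeAccounts = new Set(accountMemberships.filter(mine).map((m) => m.account_id));

  const permissions = new Set();
  for (const grant of grants) {
    if (grant.project_id !== project.project_id) continue;
    const applies =
      (grant.target_type === "user" && grant.target_id === user_id) ||
      (grant.target_type === "team" && activeTeams.has(grant.target_id)) ||
      (grant.target_type === "account" && activeAccounts.has(grant.target_id));
    if (applies) grant.permissions.forEach((p) => permissions.add(p));
  }

  // Fallback for non-restricted account projects (assumed to grant view only).
  if (!project.restricted && project.account_id && activeAccounts.has(project.account_id)) {
    permissions.add("view");
  }
  // owner expands to all project permissions.
  if (permissions.has("owner")) LADDER.forEach((p) => permissions.add(p));
  return permissions;
}
```

A restricted project skips the fallback, so an account member with no matching grant ends up with an empty set.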

The file repository can list document artifacts across visible projects, teams, accounts, and private workspaces, but it must not bypass project permissions. A file is visible only when its assigned project or related scope is visible to the caller.

Storage Indexing

A backend implementation should index the same logical item types regardless of cloud provider:

USER#user_id
ACCOUNT#account_id
ACCOUNT#account_id MEMBER#user_id
TEAM#team_id
TEAM#team_id MEMBER#user_id
PROJECT#project_id
PROJECT#project_id GRANT#grant_id
ARTIFACT#artifact_id
DOCUMENT#document_id METADATA#...
DOCUMENT#document_id WRITE#...
INVITATION#invitation_id
Notification preference records by user.

AWS stores these in DynamoDB. Azure stores the same item families in Cosmos DB. The provider changes the backing database service, not the application object model.
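
One way to keep the item families identical across providers is to build keys in a single place and pass the result to whichever database client is in use. The sketch below uses hypothetical names and reads the two-part entries above as a partition key followed by a sort key, which is an assumption about the intended layout.

```javascript
// Hypothetical provider-neutral key builders for a few item families.
const itemKeys = {
  account: (account_id) => ({ pk: `ACCOUNT#${account_id}` }),
  accountMember: (account_id, user_id) => ({ pk: `ACCOUNT#${account_id}`, sk: `MEMBER#${user_id}` }),
  teamMember: (team_id, user_id) => ({ pk: `TEAM#${team_id}`, sk: `MEMBER#${user_id}` }),
  projectGrant: (project_id, grant_id) => ({ pk: `PROJECT#${project_id}`, sk: `GRANT#${grant_id}` }),
};
```

For example, itemKeys.projectGrant("p1", "g1") yields a pk of PROJECT#p1 and an sk of GRANT#g1, whichever backend stores the item.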