ID Generation

HtmlGraph provides collision-resistant, hash-based ID generation for multi-agent collaboration. This system prevents conflicts when multiple agents create tasks concurrently.

📖 Design Document: For detailed architecture, implementation details, and rationale, see Hash-Based IDs Design Document.

Overview

Traditional timestamp-based IDs (for example, feature-20241222-143022) collide when two agents create features within the same second. Hash-based IDs make such collisions very unlikely by combining:

  • Title (content-addressability)
  • Timestamp (microsecond precision)
  • Random entropy (4 bytes by default)

ID Format

IDs follow the format {prefix}-{hash}:

Node Type   Prefix   Example
Feature     feat-    feat-a1b2c3d4
Bug         bug-     bug-12345678
Chore       chr-     chr-deadbeef
Spike       spk-     spk-87654321
Epic        epc-     epc-abcdef12
Session     sess-    sess-7890abcd
Track       trk-     trk-fedcba98
Phase       phs-     phs-11223344
Spec        spec-    spec-55667788
Plan        plan-    plan-99aabbcc
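
The format above can be checked with a regular expression. The pattern below is an assumption inferred from the examples in the table (a lowercase prefix, 8 hex characters, optional ".N" suffixes for sub-tasks). It is not the library's own HASH_ID_PATTERN, whose definition is not shown on this page, and looks_like_hash_id is an illustrative name.

```python
import re

# Hypothetical pattern inferred from the table above:
# lowercase prefix, a dash, 8 hex characters, then optional ".N" segments.
ID_SKETCH_PATTERN = re.compile(r"^([a-z]+)-([0-9a-f]{8})((?:\.\d+)*)$")


def looks_like_hash_id(node_id: str) -> bool:
    """Return True if node_id matches the sketched {prefix}-{hash} format."""
    return ID_SKETCH_PATTERN.match(node_id) is not None


print(looks_like_hash_id("feat-a1b2c3d4"))            # True
print(looks_like_hash_id("feat-a1b2c3d4.1.2"))        # True
print(looks_like_hash_id("feat-xyz"))                 # False
print(looks_like_hash_id("feature-20241222-143022"))  # False (legacy format)
```

For real validation, prefer is_valid_id, which also accepts legacy IDs.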

Basic Usage

Generating IDs

from htmlgraph import generate_id

# Generate a feature ID
feature_id = generate_id("feature", "User Authentication")
# → "feat-a1b2c3d4"

# Generate a bug ID
bug_id = generate_id("bug", "Login fails on Safari")
# → "bug-12345678"

# Generate a track ID
track_id = generate_id("track", "OAuth Integration")
# → "trk-abcdef12"

Automatic Generation

When using the SDK or CLI, IDs are generated automatically:

from htmlgraph import SDK

sdk = SDK(agent="claude")

# ID generated automatically
feature = sdk.features.create(
    title="User Authentication"
).set_priority("high").save()

print(feature.id)  # → "feat-7f3a2b1c"

# CLI also generates hash-based IDs
htmlgraph feature create "User Authentication" --priority high
# Created: feat-9e8d7c6b

Hierarchical IDs

For sub-tasks, use hierarchical IDs:

from htmlgraph import generate_id, generate_hierarchical_id

# Create parent feature
parent_id = generate_id("feature", "Auth System")  # → "feat-a1b2c3d4"

# Create sub-tasks
subtask1 = generate_hierarchical_id(parent_id, 1)  # → "feat-a1b2c3d4.1"
subtask2 = generate_hierarchical_id(parent_id, 2)  # → "feat-a1b2c3d4.2"

# Nested sub-tasks
nested = generate_hierarchical_id(subtask1, 1)     # → "feat-a1b2c3d4.1.1"

Parsing IDs

Extract components from any ID:

from htmlgraph import parse_id

# Parse a hash-based ID
result = parse_id("feat-a1b2c3d4.1.2")
# {
#     'prefix': 'feat',
#     'node_type': 'feature',
#     'hash': 'a1b2c3d4',
#     'hierarchy': [1, 2],
#     'is_legacy': False
# }

# Parse a legacy ID
result = parse_id("feature-20241222-143022")
# {
#     'prefix': 'feature',
#     'node_type': 'feature',
#     'hash': '20241222-143022',
#     'hierarchy': [],
#     'is_legacy': True
# }

Validation

Check if IDs are valid:

from htmlgraph import is_valid_id, is_legacy_id

# Hash-based IDs
is_valid_id("feat-a1b2c3d4")      # → True
is_valid_id("feat-a1b2c3d4.1.2")  # → True

# Legacy IDs (still valid)
is_valid_id("feature-20241222-143022")  # → True

# Invalid IDs
is_valid_id("invalid")            # → False
is_valid_id("feat-xyz")           # → False (not hex)

# Check legacy format
is_legacy_id("feature-20241222-143022")  # → True
is_legacy_id("feat-a1b2c3d4")            # → False

Hierarchy Helpers

Navigate hierarchical IDs:

from htmlgraph.ids import get_parent_id, get_root_id, get_depth

# Named node_id rather than id to avoid shadowing the built-in id()
node_id = "feat-a1b2c3d4.1.2"

get_parent_id(node_id)  # → "feat-a1b2c3d4.1"
get_root_id(node_id)    # → "feat-a1b2c3d4"
get_depth(node_id)      # → 2

# Root IDs have no parent
get_parent_id("feat-a1b2c3d4")  # → None
get_depth("feat-a1b2c3d4")      # → 0

Collision Resistance

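Each ID keeps 8 hex characters (32 bits) of a SHA-256 digest, so two given IDs of the same type collide with probability about 1 in 4.3 billion (2^32). Because the hashed input combines the title, a microsecond timestamp, and 4 random bytes (the default), IDs created at the same instant with the same title are still expected to differ. Across many IDs the birthday bound applies: the chance of any collision among n IDs of one type is roughly n²/2^33, or about 0.01% for 1,000 IDs and about 1.2% for 10,000. Collisions are therefore very unlikely at typical project sizes, even with many agents working concurrently.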

from htmlgraph import generate_id

# Generate 1000 IDs with identical titles
ids = [generate_id("feature", "Same Title") for _ in range(1000)]
unique = len(set(ids))
print(f"Generated {unique} unique IDs")  # → "Generated 1000 unique IDs"
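
The odds of a collision among many IDs can be estimated with the birthday approximation. The sketch below assumes IDs behave like uniformly random 32-bit values (8 hex characters); collision_probability is an illustrative helper, not part of HtmlGraph.

```python
import math


def collision_probability(n_ids: int, hash_bits: int = 32) -> float:
    """Approximate chance that at least two of n_ids uniform IDs collide.

    Uses the birthday approximation 1 - exp(-n(n-1) / (2 * 2**bits)).
    """
    space = 2 ** hash_bits
    return 1.0 - math.exp(-n_ids * (n_ids - 1) / (2 * space))


# Print the estimate for a few population sizes of a single node type.
for n in (1_000, 10_000, 100_000):
    print(f"{n:>7} IDs: {collision_probability(n):.4%}")
```

For 100,000 IDs of one type the estimate is already about 69%. The hash_bits parameter is there only to explore how a longer hash would change the odds; the source shown on this page always keeps 8 hex characters.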

Backward Compatibility

Legacy timestamp-based IDs remain fully supported:

  • Existing features with old IDs continue to work
  • parse_id() correctly identifies legacy format
  • is_valid_id() accepts both formats
  • No migration required

API Reference

generate_id

generate_id(node_type: str = 'feature', title: str = '', entropy_bytes: int = 4) -> str

Generate a collision-resistant ID.

Format: {prefix}-{hash} (e.g., feat-a1b2c3d4)

The hash is derived from:

  • Title (for some content-addressability)
  • Timestamp (microsecond precision in UTC)
  • Random bytes (entropy)

Parameter      Type  Default    Description
node_type      str   'feature'  Type of node (feature, bug, chore, etc.)
title          str   ''         Node title (used in hash for content-addressability)
entropy_bytes  int   4          Number of random bytes to include

Returns  Description
str      A collision-resistant ID like "feat-a1b2c3d4"

Example

>>> generate_id("feature", "User Authentication")
'feat-7f3a2b1c'
>>> generate_id("bug", "Login fails on Safari")
'bug-9e8d7c6b'

Source code in src/python/htmlgraph/ids.py
def generate_id(
    node_type: str = "feature",
    title: str = "",
    entropy_bytes: int = 4,
) -> str:
    """
    Generate a collision-resistant ID.

    Format: {prefix}-{hash} (e.g., feat-a1b2c3d4)

    The hash is derived from:
    - Title (for some content-addressability)
    - Timestamp (microsecond precision in UTC)
    - Random bytes (entropy)

    Args:
        node_type: Type of node (feature, bug, chore, etc.)
        title: Node title (used in hash for content-addressability)
        entropy_bytes: Number of random bytes to include (default 4)

    Returns:
        A collision-resistant ID like "feat-a1b2c3d4"

    Example:
        >>> generate_id("feature", "User Authentication")
        'feat-7f3a2b1c'
        >>> generate_id("bug", "Login fails on Safari")
        'bug-9e8d7c6b'
    """
    prefix = PREFIXES.get(node_type, node_type[:4].lower())

    # Combine multiple sources of uniqueness
    timestamp = datetime.now(timezone.utc).isoformat()
    random_bytes = os.urandom(entropy_bytes)

    # Create hash from all sources
    content = f"{title}:{timestamp}".encode() + random_bytes
    hash_digest = hashlib.sha256(content).hexdigest()[:8]

    return f"{prefix}-{hash_digest}"
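
The PREFIXES mapping used in the first line of the function body is not shown on this page. Based on the ID Format table, it presumably resembles the sketch below, with PREFIX_TO_TYPE (used by parse_id) as its inverse. The dictionary contents are an assumption reconstructed from that table, and prefix_for is an illustrative helper that mirrors the lookup and its fallback.

```python
# Assumed contents, reconstructed from the ID Format table; the real
# PREFIXES dict in htmlgraph/ids.py may contain more or different entries.
PREFIXES = {
    "feature": "feat",
    "bug": "bug",
    "chore": "chr",
    "spike": "spk",
    "epic": "epc",
    "session": "sess",
    "track": "trk",
    "phase": "phs",
    "spec": "spec",
    "plan": "plan",
}

# Inverse lookup, so a prefix such as "feat" maps back to "feature".
PREFIX_TO_TYPE = {prefix: node_type for node_type, prefix in PREFIXES.items()}


def prefix_for(node_type: str) -> str:
    """Mirror the lookup in generate_id: known prefix, else first 4 letters."""
    return PREFIXES.get(node_type, node_type[:4].lower())


print(prefix_for("feature"))    # feat
print(prefix_for("Milestone"))  # mile (fallback for an unknown type)
```

The fallback means an unregistered node type still gets an ID, but its prefix is not guaranteed to map back to the full type name when parsed.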

generate_hierarchical_id

generate_hierarchical_id(parent_id: str, index: int | None = None) -> str

Generate a sub-task ID under a parent.

Format: {parent_id}.{index} (e.g., feat-a1b2c3d4.1)

The docstring describes auto-incrementing the index from existing siblings when index is omitted, but this is not yet implemented: the current source raises ValueError if index is None, and also if index is less than 1.

Parameter  Type        Default     Description
parent_id  str         (required)  The parent node's ID
index      int | None  None        Sub-task index (1-based). Currently required; None raises ValueError.

Returns  Description
str      A hierarchical ID like "feat-a1b2c3d4.1" or "feat-a1b2c3d4.1.2"

Example

>>> generate_hierarchical_id("feat-a1b2c3d4", 1)
'feat-a1b2c3d4.1'
>>> generate_hierarchical_id("feat-a1b2c3d4.1", 2)
'feat-a1b2c3d4.1.2'

Source code in src/python/htmlgraph/ids.py
def generate_hierarchical_id(
    parent_id: str,
    index: int | None = None,
) -> str:
    """
    Generate a sub-task ID under a parent.

    Format: {parent_id}.{index} (e.g., feat-a1b2c3d4.1)

    If index is not provided, it auto-increments based on
    existing siblings (requires filesystem check).

    Args:
        parent_id: The parent node's ID
        index: Sub-task index (1-based). Auto-assigned if None.

    Returns:
        A hierarchical ID like "feat-a1b2c3d4.1" or "feat-a1b2c3d4.1.2"

    Example:
        >>> generate_hierarchical_id("feat-a1b2c3d4", 1)
        'feat-a1b2c3d4.1'
        >>> generate_hierarchical_id("feat-a1b2c3d4.1", 2)
        'feat-a1b2c3d4.1.2'
    """
    if index is None:
        raise ValueError("index is required (auto-increment not yet implemented)")

    if index < 1:
        raise ValueError("index must be >= 1")

    return f"{parent_id}.{index}"
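
In the source above, passing index=None raises ValueError because auto-increment is not yet implemented, so the caller has to choose the next index. One possible approach is sketched below, assuming the caller already holds a list of existing IDs. next_child_index is a hypothetical helper, not part of HtmlGraph.

```python
def next_child_index(parent_id: str, existing_ids: list[str]) -> int:
    """Return the next free 1-based index for direct children of parent_id.

    Hypothetical helper: existing_ids is whatever list of IDs the caller
    has collected (for example, from its own storage layer).
    """
    prefix = parent_id + "."
    used = []
    for node_id in existing_ids:
        if not node_id.startswith(prefix):
            continue
        tail = node_id[len(prefix):]
        # Direct children have a single numeric segment after the parent ID.
        if tail.isdigit():
            used.append(int(tail))
    return max(used, default=0) + 1


existing = ["feat-a1b2c3d4.1", "feat-a1b2c3d4.2", "feat-a1b2c3d4.2.1"]
index = next_child_index("feat-a1b2c3d4", existing)
print(f"feat-a1b2c3d4.{index}")  # feat-a1b2c3d4.3
```

This sketch is not safe under concurrency: two agents that read the same list could pick the same index, so a multi-agent setup would also need a lock or a retry on conflict.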

parse_id

parse_id(node_id: str) -> dict[str, str | int | list[int] | None]

Parse an ID into its components.

Parameter  Type  Description
node_id    str   The ID to parse

Returns                                  Description
dict[str, str | int | list[int] | None]  Dictionary with the following keys:

  • prefix: The type prefix (e.g., "feat")
  • node_type: The full node type (e.g., "feature")
  • hash: The hash portion (e.g., "a1b2c3d4")
  • hierarchy: List of sub-indices (e.g., [1, 2] for ".1.2")
  • is_legacy: Whether this is an old-format ID

Example

>>> parse_id("feat-a1b2c3d4.1.2")
{
    'prefix': 'feat',
    'node_type': 'feature',
    'hash': 'a1b2c3d4',
    'hierarchy': [1, 2],
    'is_legacy': False
}

Source code in src/python/htmlgraph/ids.py
def parse_id(node_id: str) -> dict[str, str | int | list[int] | None]:
    """
    Parse an ID into its components.

    Args:
        node_id: The ID to parse

    Returns:
        Dictionary with:
        - prefix: The type prefix (e.g., "feat")
        - node_type: The full node type (e.g., "feature")
        - hash: The hash portion (e.g., "a1b2c3d4")
        - hierarchy: List of sub-indices (e.g., [1, 2] for ".1.2")
        - is_legacy: Whether this is an old-format ID

    Example:
        >>> parse_id("feat-a1b2c3d4.1.2")
        {
            'prefix': 'feat',
            'node_type': 'feature',
            'hash': 'a1b2c3d4',
            'hierarchy': [1, 2],
            'is_legacy': False
        }
    """
    # Try new hash-based format
    match = HASH_ID_PATTERN.match(node_id)
    if match:
        prefix = match.group(1)
        hash_part = match.group(2)

        # Extract hierarchy by finding all .N segments after the hash
        # The regex captures the last one, but we need all of them
        hierarchy = []
        base_id = f"{prefix}-{hash_part}"
        if len(node_id) > len(base_id):
            hierarchy_str = node_id[len(base_id) :]  # e.g., ".1.2"
            hierarchy = [int(x) for x in hierarchy_str.split(".") if x]

        return {
            "prefix": prefix,
            "node_type": PREFIX_TO_TYPE.get(prefix, prefix),
            "hash": hash_part,
            "hierarchy": hierarchy,
            "is_legacy": False,
        }

    # Try legacy format (feature-20241222-143022)
    legacy_match = LEGACY_ID_PATTERN.match(node_id)
    if legacy_match:
        prefix = legacy_match.group(1)
        timestamp = legacy_match.group(2)

        return {
            "prefix": prefix,
            "node_type": prefix,  # Legacy uses full type as prefix
            "hash": timestamp,
            "hierarchy": [],
            "is_legacy": True,
        }

    # Unknown format
    return {
        "prefix": None,
        "node_type": None,
        "hash": None,
        "hierarchy": [],
        "is_legacy": None,
    }

is_valid_id

is_valid_id(node_id: str) -> bool

Check if an ID is valid (either new or legacy format).

Parameter  Type  Description
node_id    str   The ID to validate

Returns  Description
bool     True if the ID matches a known format

Example

>>> is_valid_id("feat-a1b2c3d4")
True
>>> is_valid_id("feat-a1b2c3d4.1.2")
True
>>> is_valid_id("feature-20241222-143022")
True
>>> is_valid_id("invalid")
False

Source code in src/python/htmlgraph/ids.py
def is_valid_id(node_id: str) -> bool:
    """
    Check if an ID is valid (either new or legacy format).

    Args:
        node_id: The ID to validate

    Returns:
        True if the ID matches a known format

    Example:
        >>> is_valid_id("feat-a1b2c3d4")
        True
        >>> is_valid_id("feat-a1b2c3d4.1.2")
        True
        >>> is_valid_id("feature-20241222-143022")
        True
        >>> is_valid_id("invalid")
        False
    """
    return bool(HASH_ID_PATTERN.match(node_id) or LEGACY_ID_PATTERN.match(node_id))

is_legacy_id

is_legacy_id(node_id: str) -> bool

Check if an ID uses the legacy timestamp format.

Parameter  Type  Description
node_id    str   The ID to check

Returns  Description
bool     True if this is a legacy format ID (e.g., feature-20241222-143022)

Example

>>> is_legacy_id("feature-20241222-143022")
True
>>> is_legacy_id("feat-a1b2c3d4")
False

Source code in src/python/htmlgraph/ids.py
def is_legacy_id(node_id: str) -> bool:
    """
    Check if an ID uses the legacy timestamp format.

    Args:
        node_id: The ID to check

    Returns:
        True if this is a legacy format ID (e.g., feature-20241222-143022)

    Example:
        >>> is_legacy_id("feature-20241222-143022")
        True
        >>> is_legacy_id("feat-a1b2c3d4")
        False
    """
    return bool(LEGACY_ID_PATTERN.match(node_id))

get_parent_id

get_parent_id(node_id: str) -> str | None

Get the parent ID for a hierarchical ID.

Parameter  Type  Description
node_id    str   A hierarchical ID like "feat-a1b2c3d4.1.2"

Returns     Description
str | None  The parent ID ("feat-a1b2c3d4.1") or None if not hierarchical

Example

>>> get_parent_id("feat-a1b2c3d4.1.2")
'feat-a1b2c3d4.1'
>>> get_parent_id("feat-a1b2c3d4.1")
'feat-a1b2c3d4'
>>> get_parent_id("feat-a1b2c3d4")
None

Source code in src/python/htmlgraph/ids.py
def get_parent_id(node_id: str) -> str | None:
    """
    Get the parent ID for a hierarchical ID.

    Args:
        node_id: A hierarchical ID like "feat-a1b2c3d4.1.2"

    Returns:
        The parent ID ("feat-a1b2c3d4.1") or None if not hierarchical

    Example:
        >>> get_parent_id("feat-a1b2c3d4.1.2")
        'feat-a1b2c3d4.1'
        >>> get_parent_id("feat-a1b2c3d4.1")
        'feat-a1b2c3d4'
        >>> get_parent_id("feat-a1b2c3d4")
        None
    """
    if "." not in node_id:
        return None

    return node_id.rsplit(".", 1)[0]
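
Repeatedly dropping the last ".N" segment, as get_parent_id does, yields the full ancestor chain of a sub-task. The sketch below re-implements that step locally so it runs without HtmlGraph installed; ancestors is an illustrative name, not a library function.

```python
def ancestors(node_id: str) -> list[str]:
    """Return ancestor IDs, from the nearest parent up to the root."""
    chain = []
    current = node_id
    while "." in current:
        # Same step as get_parent_id: drop the last ".N" segment.
        current = current.rsplit(".", 1)[0]
        chain.append(current)
    return chain


print(ancestors("feat-a1b2c3d4.1.2"))
# ['feat-a1b2c3d4.1', 'feat-a1b2c3d4']
print(ancestors("feat-a1b2c3d4"))
# []
```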

get_root_id

get_root_id(node_id: str) -> str

Get the root ID (without hierarchy) for any ID.

Parameter  Type  Description
node_id    str   Any ID, possibly hierarchical

Returns  Description
str      The root ID without hierarchy portion

Example

>>> get_root_id("feat-a1b2c3d4.1.2")
'feat-a1b2c3d4'
>>> get_root_id("feat-a1b2c3d4")
'feat-a1b2c3d4'

Source code in src/python/htmlgraph/ids.py
def get_root_id(node_id: str) -> str:
    """
    Get the root ID (without hierarchy) for any ID.

    Args:
        node_id: Any ID, possibly hierarchical

    Returns:
        The root ID without hierarchy portion

    Example:
        >>> get_root_id("feat-a1b2c3d4.1.2")
        'feat-a1b2c3d4'
        >>> get_root_id("feat-a1b2c3d4")
        'feat-a1b2c3d4'
    """
    parsed = parse_id(node_id)
    if parsed["prefix"] and parsed["hash"]:
        return f"{parsed['prefix']}-{parsed['hash']}"
    return node_id.split(".")[0]

get_depth

get_depth(node_id: str) -> int

Get the hierarchy depth of an ID.

Parameter  Type  Description
node_id    str   Any ID

Returns  Description
int      0 for root IDs, 1+ for hierarchical IDs

Example

>>> get_depth("feat-a1b2c3d4")
0
>>> get_depth("feat-a1b2c3d4.1")
1
>>> get_depth("feat-a1b2c3d4.1.2")
2

Source code in src/python/htmlgraph/ids.py
def get_depth(node_id: str) -> int:
    """
    Get the hierarchy depth of an ID.

    Args:
        node_id: Any ID

    Returns:
        0 for root IDs, 1+ for hierarchical IDs

    Example:
        >>> get_depth("feat-a1b2c3d4")
        0
        >>> get_depth("feat-a1b2c3d4.1")
        1
        >>> get_depth("feat-a1b2c3d4.1.2")
        2
    """
    parsed = parse_id(node_id)
    hierarchy = parsed.get("hierarchy", [])
    if isinstance(hierarchy, list):
        return len(hierarchy)
    return 0
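
The hierarchy indices that get_depth counts are also useful for ordering. Sorting hierarchical IDs as plain strings places ".10" before ".2", so the sketch below compares the indices as integers instead. It assumes well-formed IDs whose segments after the first dot are all numeric; hierarchy_sort_key is an illustrative helper, not part of HtmlGraph.

```python
def hierarchy_sort_key(node_id: str) -> tuple[str, tuple[int, ...]]:
    """Sort key: root ID first, then hierarchy indices compared as integers."""
    root, *segments = node_id.split(".")
    return root, tuple(int(s) for s in segments)


ids = [
    "feat-a1b2c3d4.10",
    "feat-a1b2c3d4.2",
    "feat-a1b2c3d4",
    "feat-a1b2c3d4.2.1",
]
print(sorted(ids, key=hierarchy_sort_key))
# ['feat-a1b2c3d4', 'feat-a1b2c3d4.2', 'feat-a1b2c3d4.2.1', 'feat-a1b2c3d4.10']
```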