ID Generation

HtmlGraph provides collision-resistant, hash-based ID generation for multi-agent collaboration. This system prevents conflicts when multiple agents create tasks concurrently.

📖 Design Document: For detailed architecture, implementation details, and rationale, see Hash-Based IDs Design Document.

Overview

Traditional timestamp-based IDs (feature-20241222-143022) can collide when two agents create features at the same second. Hash-based IDs eliminate this problem by combining:

Title (content-addressability)
Timestamp (microsecond precision)
Random entropy (4 bytes by default)

ID Format

IDs follow the format {prefix}-{hash}:

Node Type	Prefix	Example
Feature	`feat-`	`feat-a1b2c3d4`
Bug	`bug-`	`bug-12345678`
Chore	`chr-`	`chr-deadbeef`
Spike	`spk-`	`spk-87654321`
Epic	`epc-`	`epc-abcdef12`
Session	`sess-`	`sess-7890abcd`
Track	`trk-`	`trk-fedcba98`
Phase	`phs-`	`phs-11223344`
Spec	`spec-`	`spec-55667788`
Plan	`plan-`	`plan-99aabbcc`
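
The pattern the library uses to recognise these IDs is not shown on this page. As a rough sketch, assuming a lowercase prefix, an 8-character lowercase hex hash, and optional `.N` hierarchy suffixes, a matching regular expression might look like the following. `ID_SKETCH` and `looks_like_hash_id` are hypothetical names for illustration, not part of htmlgraph.

```python
import re

# Hypothetical pattern, inferred from the examples in the table above:
# lowercase prefix, a dash, 8 hex characters, then zero or more ".N" suffixes.
ID_SKETCH = re.compile(r"^([a-z]+)-([0-9a-f]{8})((?:\.\d+)*)$")


def looks_like_hash_id(node_id: str) -> bool:
    """Return True if node_id has the {prefix}-{hash} shape sketched above."""
    return ID_SKETCH.match(node_id) is not None


print(looks_like_hash_id("feat-a1b2c3d4"))     # True
print(looks_like_hash_id("trk-fedcba98.1.2"))  # True
print(looks_like_hash_id("feat-xyz"))          # False: "xyz" is not 8 hex characters
```

The real pattern may be stricter or looser (for example, about hash length or allowed prefixes), so treat this only as a way to reason about the format.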

Basic Usage

Generating IDs

from htmlgraph import generate_id

# Generate a feature ID
feature_id = generate_id("feature", "User Authentication")
# → "feat-a1b2c3d4"

# Generate a bug ID
bug_id = generate_id("bug", "Login fails on Safari")
# → "bug-12345678"

# Generate a track ID
track_id = generate_id("track", "OAuth Integration")
# → "trk-abcdef12"

Automatic Generation

When using the SDK or CLI, IDs are generated automatically:

from htmlgraph import SDK

sdk = SDK(agent="claude")

# ID generated automatically
feature = sdk.features.create(
    title="User Authentication"
).set_priority("high").save()

print(feature.id)  # → "feat-7f3a2b1c"

# CLI also generates hash-based IDs
htmlgraph feature create "User Authentication" --priority high
# Created: feat-9e8d7c6b

Hierarchical IDs

For sub-tasks, use hierarchical IDs:

from htmlgraph import generate_id, generate_hierarchical_id

# Create parent feature
parent_id = generate_id("feature", "Auth System")  # → "feat-a1b2c3d4"

# Create sub-tasks
subtask1 = generate_hierarchical_id(parent_id, 1)  # → "feat-a1b2c3d4.1"
subtask2 = generate_hierarchical_id(parent_id, 2)  # → "feat-a1b2c3d4.2"

# Nested sub-tasks
nested = generate_hierarchical_id(subtask1, 1)     # → "feat-a1b2c3d4.1.1"

Parsing IDs

Extract components from any ID:

from htmlgraph import parse_id

# Parse a hash-based ID
result = parse_id("feat-a1b2c3d4.1.2")
# {
#     'prefix': 'feat',
#     'node_type': 'feature',
#     'hash': 'a1b2c3d4',
#     'hierarchy': [1, 2],
#     'is_legacy': False
# }

# Parse a legacy ID
result = parse_id("feature-20241222-143022")
# {
#     'prefix': 'feature',
#     'node_type': 'feature',
#     'hash': '20241222-143022',
#     'hierarchy': [],
#     'is_legacy': True
# }
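
Because `hierarchy` is a list of integers rather than a string, it can serve as a numeric sort key, so that `.10` sorts after `.2`. The sketch below splits the ID by hand instead of calling `parse_id`, so it runs without htmlgraph installed; `id_sort_key` is a hypothetical helper, not part of the library.

```python
def id_sort_key(node_id: str) -> tuple[str, tuple[int, ...]]:
    """Split an ID into (root, hierarchy) so sub-tasks sort numerically."""
    # Neither hash-based nor legacy roots contain dots, so the first
    # dot-separated piece is the root and the rest are integer indices.
    root, *parts = node_id.split(".")
    return root, tuple(int(p) for p in parts)


ids = ["feat-a1b2c3d4.10", "feat-a1b2c3d4.2", "feat-a1b2c3d4", "feat-a1b2c3d4.2.1"]
print(sorted(ids, key=id_sort_key))
# ['feat-a1b2c3d4', 'feat-a1b2c3d4.2', 'feat-a1b2c3d4.2.1', 'feat-a1b2c3d4.10']
```

A plain string sort would place `feat-a1b2c3d4.10` before `feat-a1b2c3d4.2`, which is why the indices are compared as integers.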

Validation

Check if IDs are valid:

from htmlgraph import is_valid_id, is_legacy_id

# Hash-based IDs
is_valid_id("feat-a1b2c3d4")      # → True
is_valid_id("feat-a1b2c3d4.1.2")  # → True

# Legacy IDs (still valid)
is_valid_id("feature-20241222-143022")  # → True

# Invalid IDs
is_valid_id("invalid")            # → False
is_valid_id("feat-xyz")           # → False (not hex)

# Check legacy format
is_legacy_id("feature-20241222-143022")  # → True
is_legacy_id("feat-a1b2c3d4")            # → False

Hierarchy Helpers

Navigate hierarchical IDs:

from htmlgraph.ids import get_parent_id, get_root_id, get_depth

node_id = "feat-a1b2c3d4.1.2"  # avoid shadowing the built-in id()

get_parent_id(node_id)  # → "feat-a1b2c3d4.1"
get_root_id(node_id)    # → "feat-a1b2c3d4"
get_depth(node_id)      # → 2

# Root IDs have no parent
get_parent_id("feat-a1b2c3d4")  # → None
get_depth("feat-a1b2c3d4")      # → 0
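
To walk all the way up from a sub-task to its root, the parent lookup can be repeated until no dot remains. The sketch below applies the same `rsplit` rule that `get_parent_id` uses in the API reference, written standalone so it runs without htmlgraph; `ancestor_ids` is a hypothetical helper.

```python
def ancestor_ids(node_id: str) -> list[str]:
    """Return every ancestor of node_id, nearest parent first."""
    ancestors = []
    while "." in node_id:
        # Drop the last ".N" segment to step up one level.
        node_id = node_id.rsplit(".", 1)[0]
        ancestors.append(node_id)
    return ancestors


print(ancestor_ids("feat-a1b2c3d4.1.2"))
# ['feat-a1b2c3d4.1', 'feat-a1b2c3d4']
print(ancestor_ids("feat-a1b2c3d4"))
# []
```

The last element of the list, when there is one, is the root ID, and the length of the list equals the depth.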

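Collision Resistance

Each ID keeps 8 hexadecimal characters (32 bits) of the hash, so any two IDs with the same prefix collide with a probability of about 1 in 4.3 billion. The chance of at least one collision grows with the number of IDs that share a prefix (the birthday bound): it is roughly 0.01% at 1,000 IDs and about 1% at 10,000. Because the title, a microsecond timestamp, and 4 random bytes (by default) all feed the hash, agents that create nodes with the same title at the same instant still get independent IDs.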

from htmlgraph import generate_id

# Generate 1000 IDs with identical titles
ids = [generate_id("feature", "Same Title") for _ in range(1000)]
unique = len(set(ids))
print(f"Generated {unique} unique IDs")  # → "Generated 1000 unique IDs" (a duplicate is possible but very unlikely)
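
How unlikely a duplicate is can be estimated with the birthday bound. The sketch below assumes the 32-bit (8 hex character) hash shown in the `generate_id` source, uniformly distributed hashes, and IDs that all share one prefix; `collision_probability` is a hypothetical helper, not part of htmlgraph.

```python
import math


def collision_probability(n_ids: int, hash_bits: int = 32) -> float:
    """Birthday-bound approximation: P ≈ 1 - exp(-n(n-1) / 2^(bits+1))."""
    space = 2 ** hash_bits
    exponent = n_ids * (n_ids - 1) / (2 * space)
    # -expm1(-x) equals 1 - exp(-x) but stays accurate for very small x.
    return -math.expm1(-exponent)


for n in (1_000, 10_000, 100_000):
    print(f"{n:>7} IDs: {collision_probability(n):.4%}")
```

Under these assumptions, the estimate stays near 0.01% for 1,000 IDs but climbs to roughly 69% for 100,000, so very large projects may want to check for an existing ID before saving.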

Backward Compatibility

Legacy timestamp-based IDs remain fully supported:

Existing features with old IDs continue to work
parse_id() correctly identifies legacy format
is_valid_id() accepts both formats
No migration required

API Reference

generate_id

generate_id(node_type: str = 'feature', title: str = '', entropy_bytes: int = 4) -> str

Generate a collision-resistant ID.

Format: {prefix}-{hash} (e.g., feat-a1b2c3d4)

The hash is derived from:

Title (for some content-addressability)
Timestamp (microsecond precision in UTC)
Random bytes (entropy)

PARAMETER	DESCRIPTION
`node_type`	Type of node (feature, bug, chore, etc.) TYPE: `str` DEFAULT: `'feature'`
`title`	Node title (used in hash for content-addressability) TYPE: `str` DEFAULT: `''`
`entropy_bytes`	Number of random bytes to include (default 4) TYPE: `int` DEFAULT: `4`

RETURNS	DESCRIPTION
`str`	A collision-resistant ID like "feat-a1b2c3d4"

Example

>>> generate_id("feature", "User Authentication")
'feat-7f3a2b1c'
>>> generate_id("bug", "Login fails on Safari")
'bug-9e8d7c6b'

Source code in src/python/htmlgraph/ids.py

def generate_id(
    node_type: str = "feature",
    title: str = "",
    entropy_bytes: int = 4,
) -> str:
    """
    Generate a collision-resistant ID.

    Format: {prefix}-{hash} (e.g., feat-a1b2c3d4)

    The hash is derived from:
    - Title (for some content-addressability)
    - Timestamp (microsecond precision in UTC)
    - Random bytes (entropy)

    Args:
        node_type: Type of node (feature, bug, chore, etc.)
        title: Node title (used in hash for content-addressability)
        entropy_bytes: Number of random bytes to include (default 4)

    Returns:
        A collision-resistant ID like "feat-a1b2c3d4"

    Example:
        >>> generate_id("feature", "User Authentication")
        'feat-7f3a2b1c'
        >>> generate_id("bug", "Login fails on Safari")
        'bug-9e8d7c6b'
    """
    prefix = PREFIXES.get(node_type, node_type[:4].lower())

    # Combine multiple sources of uniqueness
    timestamp = datetime.now(timezone.utc).isoformat()
    random_bytes = os.urandom(entropy_bytes)

    # Create hash from all sources
    content = f"{title}:{timestamp}".encode() + random_bytes
    hash_digest = hashlib.sha256(content).hexdigest()[:8]

    return f"{prefix}-{hash_digest}"
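
To experiment with the scheme without installing the package, the listing above can be restated as a standalone sketch. `PREFIXES` here is reconstructed from the ID Format table on this page and may not match the library's actual mapping, and `sketch_generate_id` is a hypothetical name.

```python
import hashlib
import os
from datetime import datetime, timezone

# Assumed mapping, reconstructed from the prefix table on this page.
PREFIXES = {
    "feature": "feat", "bug": "bug", "chore": "chr", "spike": "spk",
    "epic": "epc", "session": "sess", "track": "trk", "phase": "phs",
    "spec": "spec", "plan": "plan",
}


def sketch_generate_id(node_type: str = "feature", title: str = "", entropy_bytes: int = 4) -> str:
    """Standalone restatement of the generate_id logic shown above."""
    # Unknown node types fall back to their first four characters, lowercased.
    prefix = PREFIXES.get(node_type, node_type[:4].lower())
    timestamp = datetime.now(timezone.utc).isoformat()
    content = f"{title}:{timestamp}".encode() + os.urandom(entropy_bytes)
    # Only the first 8 hex characters (32 bits) of the digest are kept,
    # so a larger entropy_bytes does not lengthen the ID.
    return f"{prefix}-{hashlib.sha256(content).hexdigest()[:8]}"


print(sketch_generate_id("feature", "User Authentication"))
print(sketch_generate_id("Milestone", "Q3 release"))          # prefix falls back to "mile"
print(len(sketch_generate_id("bug", "x", entropy_bytes=16)))  # 12: "bug-" plus 8 hex characters
```

Two details stand out: an unrecognised `node_type` is not rejected but silently truncated into a prefix, and the ID length is fixed by the `[:8]` slice rather than by `entropy_bytes`.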

generate_hierarchical_id

generate_hierarchical_id(parent_id: str, index: int | None = None) -> str

Generate a sub-task ID under a parent.

Format: {parent_id}.{index} (e.g., feat-a1b2c3d4.1)

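The index is currently required. Auto-increment based on existing siblings is not yet implemented, so calling the function with index=None raises ValueError. An index below 1 also raises ValueError.

PARAMETER	DESCRIPTION
`parent_id`	The parent node's ID TYPE: `str`
`index`	Sub-task index (1-based). Passing None raises ValueError until auto-increment is implemented. TYPE: `int \| None` DEFAULT: `None`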

RETURNS	DESCRIPTION
`str`	A hierarchical ID like "feat-a1b2c3d4.1" or "feat-a1b2c3d4.1.2"

Example

generate_hierarchical_id("feat-a1b2c3d4", 1) 'feat-a1b2c3d4.1' generate_hierarchical_id("feat-a1b2c3d4.1", 2) 'feat-a1b2c3d4.1.2'

Source code in src/python/htmlgraph/ids.py

def generate_hierarchical_id(
    parent_id: str,
    index: int | None = None,
) -> str:
    """
    Generate a sub-task ID under a parent.

    Format: {parent_id}.{index} (e.g., feat-a1b2c3d4.1)

    If index is not provided, it auto-increments based on
    existing siblings (requires filesystem check).

    Args:
        parent_id: The parent node's ID
        index: Sub-task index (1-based). Auto-assigned if None.

    Returns:
        A hierarchical ID like "feat-a1b2c3d4.1" or "feat-a1b2c3d4.1.2"

    Example:
        >>> generate_hierarchical_id("feat-a1b2c3d4", 1)
        'feat-a1b2c3d4.1'
        >>> generate_hierarchical_id("feat-a1b2c3d4.1", 2)
        'feat-a1b2c3d4.1.2'
    """
    if index is None:
        raise ValueError("index is required (auto-increment not yet implemented)")

    if index < 1:
        raise ValueError("index must be >= 1")

    return f"{parent_id}.{index}"
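
In the listing above, a missing index raises ValueError, so callers have to choose the index themselves. One possible approach, sketched here under the assumption that the caller already holds a list of existing IDs, is to take the highest direct-child index and add one. `next_child_id` is a hypothetical helper, not part of htmlgraph.

```python
def next_child_id(parent_id: str, existing_ids: list[str]) -> str:
    """Return parent_id with the next unused 1-based child index appended."""
    prefix = parent_id + "."
    used = []
    for node_id in existing_ids:
        # Only direct children count: "<parent>.<N>" with nothing after N.
        suffix = node_id[len(prefix):] if node_id.startswith(prefix) else ""
        if suffix.isdecimal():
            used.append(int(suffix))
    return f"{parent_id}.{max(used, default=0) + 1}"


existing = ["feat-a1b2c3d4.1", "feat-a1b2c3d4.2", "feat-a1b2c3d4.2.1"]
print(next_child_id("feat-a1b2c3d4", existing))    # feat-a1b2c3d4.3
print(next_child_id("feat-a1b2c3d4.2", existing))  # feat-a1b2c3d4.2.2
print(next_child_id("feat-a1b2c3d4.1", existing))  # feat-a1b2c3d4.1.1
```

This is not safe under concurrency: two agents reading the same list at the same time would pick the same index, so some form of coordination would still be needed when several agents add sub-tasks to one parent.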

parse_id

parse_id(node_id: str) -> dict[str, str | int | list[int] | None]

Parse an ID into its components.

PARAMETER	DESCRIPTION
`node_id`	The ID to parse TYPE: `str`

RETURNS	DESCRIPTION
`dict[str, str \| int \| list[int] \| None]`	Dictionary with the keys: prefix (the type prefix, e.g., "feat"); node_type (the full node type, e.g., "feature"); hash (the hash portion, e.g., "a1b2c3d4"); hierarchy (list of sub-indices, e.g., [1, 2] for ".1.2"); is_legacy (whether this is an old-format ID)

Example

>>> parse_id("feat-a1b2c3d4.1.2")
{
    'prefix': 'feat',
    'node_type': 'feature',
    'hash': 'a1b2c3d4',
    'hierarchy': [1, 2],
    'is_legacy': False
}

Source code in src/python/htmlgraph/ids.py

def parse_id(node_id: str) -> dict[str, str | int | list[int] | None]:
    """
    Parse an ID into its components.

    Args:
        node_id: The ID to parse

    Returns:
        Dictionary with:
        - prefix: The type prefix (e.g., "feat")
        - node_type: The full node type (e.g., "feature")
        - hash: The hash portion (e.g., "a1b2c3d4")
        - hierarchy: List of sub-indices (e.g., [1, 2] for ".1.2")
        - is_legacy: Whether this is an old-format ID

    Example:
        >>> parse_id("feat-a1b2c3d4.1.2")
        {
            'prefix': 'feat',
            'node_type': 'feature',
            'hash': 'a1b2c3d4',
            'hierarchy': [1, 2],
            'is_legacy': False
        }
    """
    # Try new hash-based format
    match = HASH_ID_PATTERN.match(node_id)
    if match:
        prefix = match.group(1)
        hash_part = match.group(2)

        # Extract hierarchy by finding all .N segments after the hash
        # The regex captures the last one, but we need all of them
        hierarchy = []
        base_id = f"{prefix}-{hash_part}"
        if len(node_id) > len(base_id):
            hierarchy_str = node_id[len(base_id) :]  # e.g., ".1.2"
            hierarchy = [int(x) for x in hierarchy_str.split(".") if x]

        return {
            "prefix": prefix,
            "node_type": PREFIX_TO_TYPE.get(prefix, prefix),
            "hash": hash_part,
            "hierarchy": hierarchy,
            "is_legacy": False,
        }

    # Try legacy format (feature-20241222-143022)
    legacy_match = LEGACY_ID_PATTERN.match(node_id)
    if legacy_match:
        prefix = legacy_match.group(1)
        timestamp = legacy_match.group(2)

        return {
            "prefix": prefix,
            "node_type": prefix,  # Legacy uses full type as prefix
            "hash": timestamp,
            "hierarchy": [],
            "is_legacy": True,
        }

    # Unknown format
    return {
        "prefix": None,
        "node_type": None,
        "hash": None,
        "hierarchy": [],
        "is_legacy": None,
    }

is_valid_id

is_valid_id(node_id: str) -> bool

Check if an ID is valid (either new or legacy format).

PARAMETER	DESCRIPTION
`node_id`	The ID to validate TYPE: `str`

RETURNS	DESCRIPTION
`bool`	True if the ID matches a known format

Example

>>> is_valid_id("feat-a1b2c3d4")
True
>>> is_valid_id("feat-a1b2c3d4.1.2")
True
>>> is_valid_id("feature-20241222-143022")
True
>>> is_valid_id("invalid")
False

Source code in src/python/htmlgraph/ids.py

def is_valid_id(node_id: str) -> bool:
    """
    Check if an ID is valid (either new or legacy format).

    Args:
        node_id: The ID to validate

    Returns:
        True if the ID matches a known format

    Example:
        >>> is_valid_id("feat-a1b2c3d4")
        True
        >>> is_valid_id("feat-a1b2c3d4.1.2")
        True
        >>> is_valid_id("feature-20241222-143022")
        True
        >>> is_valid_id("invalid")
        False
    """
    return bool(HASH_ID_PATTERN.match(node_id) or LEGACY_ID_PATTERN.match(node_id))

is_legacy_id

is_legacy_id(node_id: str) -> bool

Check if an ID uses the legacy timestamp format.

PARAMETER	DESCRIPTION
`node_id`	The ID to check TYPE: `str`

RETURNS	DESCRIPTION
`bool`	True if this is a legacy format ID (e.g., feature-20241222-143022)

Example

>>> is_legacy_id("feature-20241222-143022")
True
>>> is_legacy_id("feat-a1b2c3d4")
False

Source code in src/python/htmlgraph/ids.py

def is_legacy_id(node_id: str) -> bool:
    """
    Check if an ID uses the legacy timestamp format.

    Args:
        node_id: The ID to check

    Returns:
        True if this is a legacy format ID (e.g., feature-20241222-143022)

    Example:
        >>> is_legacy_id("feature-20241222-143022")
        True
        >>> is_legacy_id("feat-a1b2c3d4")
        False
    """
    return bool(LEGACY_ID_PATTERN.match(node_id))

get_parent_id

get_parent_id(node_id: str) -> str | None

Get the parent ID for a hierarchical ID.

PARAMETER	DESCRIPTION
`node_id`	A hierarchical ID like "feat-a1b2c3d4.1.2" TYPE: `str`

RETURNS	DESCRIPTION
`str \| None`	The parent ID ("feat-a1b2c3d4.1") or None if not hierarchical

Example

>>> get_parent_id("feat-a1b2c3d4.1.2")
'feat-a1b2c3d4.1'
>>> get_parent_id("feat-a1b2c3d4.1")
'feat-a1b2c3d4'
>>> get_parent_id("feat-a1b2c3d4")
None

Source code in src/python/htmlgraph/ids.py

def get_parent_id(node_id: str) -> str | None:
    """
    Get the parent ID for a hierarchical ID.

    Args:
        node_id: A hierarchical ID like "feat-a1b2c3d4.1.2"

    Returns:
        The parent ID ("feat-a1b2c3d4.1") or None if not hierarchical

    Example:
        >>> get_parent_id("feat-a1b2c3d4.1.2")
        'feat-a1b2c3d4.1'
        >>> get_parent_id("feat-a1b2c3d4.1")
        'feat-a1b2c3d4'
        >>> get_parent_id("feat-a1b2c3d4")
        None
    """
    if "." not in node_id:
        return None

    return node_id.rsplit(".", 1)[0]

get_root_id

get_root_id(node_id: str) -> str

Get the root ID (without hierarchy) for any ID.

PARAMETER	DESCRIPTION
`node_id`	Any ID, possibly hierarchical TYPE: `str`

RETURNS	DESCRIPTION
`str`	The root ID without hierarchy portion

Example

>>> get_root_id("feat-a1b2c3d4.1.2")
'feat-a1b2c3d4'
>>> get_root_id("feat-a1b2c3d4")
'feat-a1b2c3d4'

Source code in src/python/htmlgraph/ids.py

def get_root_id(node_id: str) -> str:
    """
    Get the root ID (without hierarchy) for any ID.

    Args:
        node_id: Any ID, possibly hierarchical

    Returns:
        The root ID without hierarchy portion

    Example:
        >>> get_root_id("feat-a1b2c3d4.1.2")
        'feat-a1b2c3d4'
        >>> get_root_id("feat-a1b2c3d4")
        'feat-a1b2c3d4'
    """
    parsed = parse_id(node_id)
    if parsed["prefix"] and parsed["hash"]:
        return f"{parsed['prefix']}-{parsed['hash']}"
    return node_id.split(".")[0]

get_depth

get_depth(node_id: str) -> int

Get the hierarchy depth of an ID.

PARAMETER	DESCRIPTION
`node_id`	Any ID TYPE: `str`

RETURNS	DESCRIPTION
`int`	0 for root IDs, 1+ for hierarchical IDs

Example

>>> get_depth("feat-a1b2c3d4")
0
>>> get_depth("feat-a1b2c3d4.1")
1
>>> get_depth("feat-a1b2c3d4.1.2")
2

Source code in src/python/htmlgraph/ids.py

def get_depth(node_id: str) -> int:
    """
    Get the hierarchy depth of an ID.

    Args:
        node_id: Any ID

    Returns:
        0 for root IDs, 1+ for hierarchical IDs

    Example:
        >>> get_depth("feat-a1b2c3d4")
        0
        >>> get_depth("feat-a1b2c3d4.1")
        1
        >>> get_depth("feat-a1b2c3d4.1.2")
        2
    """
    parsed = parse_id(node_id)
    hierarchy = parsed.get("hierarchy", [])
    if isinstance(hierarchy, list):
        return len(hierarchy)
    return 0