Version |
stable |
|---|---|
Updated |
2026-05-25 |
Authors |
Ben Ramsey <ben@ramsey.dev> |
Common Interfaces and Factories for Identifiers
Abstract
This specification defines common interfaces for representing and creating identifiers in PHP. Identifiers are names that identify specific objects or types of objects. Software engineers have defined many different identifiers based on a variety of needs, such as uniqueness, authority, and scalability. Some popular identifiers defined as a result of these efforts include [UUID], [ULID], and [Snowflake] identifiers.
Final implementations MAY decorate identifiers with more functionality than proposed here, but they MUST implement the indicated interfaces/functionality described in this document.
Introduction
As programmers, we use identifiers everywhere. Without them, we wouldn’t be able to keep track of, for example, a user’s account or the location of a web page. To identify a user’s account, we might use an auto-incrementing integer created by a database; for a web page, we might use a URL. Both can uniquely identify the object they refer to, based on their context—within a specific database table, the user’s identifier can refer only to them, and on the web, a URL can refer only to a single web page.
Identifiers come in many forms. We’ve mentioned two such forms: integers and
strings. Within these forms, identifiers might have metadata embedded within
them. From the URL, we can assume that it resolves to some information stored
on the web; we can determine the protocol, host name, and port number used to
access this information. Depending on the combination of characters in the URL,
we might also be able to determine the expected language of the information
(e.g., if the URL includes a path segment like /en-US/) or the title of the
page or the date of the article, if the URL includes this information in a
human-readable form. On the other hand, the user’s account identifier might not
include any obvious metadata at all. It is a simple integer, with no context.
We can decipher information from the URL because there is a specification for how to form them. Other data included in URLs might be intuited from context clues—if part of the URL appears to have a formatted date-time string, then that date is likely related to the content at that URL, in some way.
Other types of identifiers might include non-obvious metadata that we can only decipher if we know how they were created. If we know the user’s account identifier is a [Snowflake], then we know we can extract the time it was created and possibly other information, such as the machine or worker that created it.
Most identifiers share a handful of common traits, and we tend to use them similarly, despite how different they might appear on the surface. We might compare them for equality or sorting. We might convert them to a binary representation for storage or extract the date to find when they were created. This specification attempts to capture these commonalities, providing applications and libraries a means to create and pass identifiers, without being locked-in to a specific implementation.
Terminology
Requirements Language
The key words “MUST”, “MUST NOT”, “REQUIRED”, “SHALL”, “SHALL NOT”, “SHOULD”, “SHOULD NOT”, “RECOMMENDED”, “NOT RECOMMENDED”, “MAY”, and “OPTIONAL” in this document are to be interpreted as described in BCP 14 [RFC2119] [RFC8174] when, and only when, they appear in all capitals, as shown here.
Definitions
- Binary
A base-2 numeral system, where each position has a value of either zero (0) or one (1), also known as a bit. In the context of this specification, binary data refers to a byte string.
- Byte string
A sequence of bytes in which each byte is a sequence of 8 bits, with 256 possible values. A byte string is sometimes referred to as an octet stream.
- Date-time
A representation of a point in history as a date and time, usually using an instance of
\DateTimeInterfacein PHP.- Identifier
“[A] name that identifies (that is, labels the identity of) either a unique object or a unique class of objects, where the ‘object’ or class may be an idea, physical countable object (or class thereof), or physical noncountable substance (or class thereof)” [Wikipedia].
Usage
Creating Identifiers
Applications SHOULD use factories to create identifiers, wherever possible.
$identifier = $factory->create();
Sorting Identifiers
This specification supports in-application sorting of identifiers. For example:
usort($identifiers, fn (Identifier $a, Identifier $b): int => $a->compareTo($b));
Get the Timestamp
DateTimeIdentifier instances support extracting the date-time value from the
identifier. The return value of getDateTime() is a PHP \DateTimeImmutable
instance.
$timestamp = $identifier->getDateTime()->getTimestamp();
Interfaces
Identifiers
Identifier
The identifier interface defines the most basic operations all identifiers support. These basic operations provide the ability to compare identifiers and convert the identifier to a string for presentation.
To provide comparison, implementations MUST implement compareTo(), which
MUST accept any value for comparison, returning an integer less than, equal to,
or greater than zero, if the identifier is less than, equal to, or greater than
the value. This allows for sorting identifiers.
Likewise, equals() MUST accept any value for comparison and return boolean
true if the identifier is equal to the value and false otherwise.
If an implementation wishes to indicate that it cannot compare the identifier
to the provided value, it MUST throw an \Identifier\Exception\NotComparable
exception.
The toString() method SHOULD return a human-readable representation of the
identifier (e.g., integer string, UUID, hexadecimal-encoded, Base64-encoded,
etc.).
namespace Identifier;
use JsonSerializable;
use Stringable;
/**
* Defines the most basic operations all identifiers support
*/
interface Identifier extends JsonSerializable, Stringable
{
/**
* Returns an integer less than, equal to, or greater than zero if the
* identifier is less than, equal to, or greater than the other value
*
* @throws Exception\NotComparable MUST throw if the implementation is
* unable to make comparisons with the other value provided
*/
public function compareTo(mixed $other): int;
/**
* Returns true if the identifier is equal to the other value
*
* @throws Exception\NotComparable MUST throw if the implementation is
* unable to make comparisons with the other value provided
*/
public function equals(mixed $other): bool;
/**
* Returns a string representation of the identifier, formatted for use
* with JSON
*
* This MUST be aliased to `__toString()`; the serialized JSON form is a
* string.
*
* @see self::__toString()
*
* @return non-empty-string
*/
public function jsonSerialize(): string;
/**
* Returns a string representation of the identifier
*
* This MUST be aliased to `__toString()`. It is provided for
* convenience.
*
* @see self::__toString()
*
* @return non-empty-string
*/
public function toString(): string;
/**
* Returns a string representation of the identifier
*
* @return non-empty-string
*/
public function __toString(): string;
}
BytesIdentifier
A bytes identifier is an identifier that is based on arbitrary binary data. The
toBytes() method MUST return the byte string representation of the
identifier.
Bytes identifiers are useful especially when the size of the identifier overflows the system limitations for maximum or minimum integers. For example, UUIDs and ULIDs, both 128-bit integers, may be represented as bytes identifiers.
namespace Identifier;
/**
* An identifier based on arbitrary binary data
*/
interface BytesIdentifier extends Identifier
{
/**
* Returns the identifier as a raw byte string
*
* @return non-empty-string
*/
public function toBytes(): string;
}
DateTimeIdentifier
A date-time identifier is based on a date-time value or otherwise has a date-time value associated with it that may be extracted from the identifier.
namespace Identifier;
use DateTimeImmutable;
/**
* An identifier based on a date-time value
*/
interface DateTimeIdentifier extends Identifier
{
/**
* Returns a date-time representation of the timestamp associated with
* this identifier
*/
public function getDateTime(): DateTimeImmutable;
}
IntegerIdentifier
Integer identifiers are identifiers that may be represented as integers.
Implementations MAY support identifiers greater than PHP_INT_MAX and less
than PHP_INT_MIN. In this case, the toInteger() method SHOULD return a
string value. If the return value is a string, it MUST be an integer string.
If an implementation does not support identifiers greater than PHP_INT_MAX
or less than PHP_INT_MIN, and the toInteger() operation attempts to
return an integer outside these boundaries, it MUST throw an
\Identifier\Exception\OutOfRange exception.
namespace Identifier;
/**
* An identifier that may be represented as an integer
*/
interface IntegerIdentifier extends Identifier
{
/**
* Returns an integer representation of the identifier
*
* @return int | numeric-string The integer value or string integer
* representation of the identifier.
*
* @throws Exception\OutOfRange MUST throw if the implementation does
* not support integers outside the range of `PHP_INT_MIN` and
* `PHP_INT_MAX` and the identifier evaluates to an integer outside
* this range
*/
public function toInteger(): int | string;
}
Identifier Factories
IdentifierFactory
IdentifierFactory defines a common interface for factories that create
identifiers.
Descendants of IdentifierFactory MAY specify a narrower return type for the
create() method.
namespace Identifier;
/**
* Creates identifiers
*/
interface IdentifierFactory
{
/**
* Creates a new instance of an identifier
*/
public function create(): Identifier;
}
BytesIdentifierFactory
BytesIdentifierFactory defines a common interface for factories that create
identifiers from raw byte strings.
namespace Identifier;
/**
* Creates identifiers based on arbitrary binary data
*/
interface BytesIdentifierFactory extends IdentifierFactory
{
public function create(): BytesIdentifier;
/**
* Creates a new instance of an identifier from the given byte string
* representation
*
* @param non-empty-string $identifier A binary octet string encoded
* according to the specification for the type of identifier
*
* @throws Exception\InvalidArgument MUST throw if $identifier is not a
* legal value
*/
public function createFromBytes(string $identifier): BytesIdentifier;
}
DateTimeIdentifierFactory
DateTimeIdentifierFactory defines a common interface for factories that
create identifiers from date-time values.
namespace Identifier;
use DateTimeInterface;
/**
* Creates identifiers based on date-time values
*/
interface DateTimeIdentifierFactory extends IdentifierFactory
{
public function create(): DateTimeIdentifier;
/**
* Creates a new instance of an identifier from the given date-time
*
* @param DateTimeInterface $dateTime The date-time to use when creating
* the identifier
*
* @throws Exception\InvalidArgument MUST throw if $dateTime is not a
* legal value
*/
public function createFromDateTime(
DateTimeInterface $dateTime,
): DateTimeIdentifier;
}
IntegerIdentifierFactory
IntegerIdentifierFactory defines a common interface for factories that
create identifiers from integers.
namespace Identifier;
/**
* Creates identifiers that may be represented as integers
*/
interface IntegerIdentifierFactory extends IdentifierFactory
{
public function create(): IntegerIdentifier;
/**
* Creates a new instance of an identifier from the given integer
* representation
*
* @param int | numeric-string $identifier The integer value or string
* integer representation of the identifier.
*
* @throws Exception\InvalidArgument MUST throw if the identifier is not
* a legal value
* @throws Exception\OutOfRange MUST throw if the implementation does
* not support integers outside the range of `PHP_INT_MIN` and
* `PHP_INT_MAX` and the identifier evaluates to an integer outside
* this range
*/
public function createFromInteger(int | string $identifier): IntegerIdentifier;
}
StringIdentifierFactory
StringIdentifierFactory defines a common interface for factories that
create identifiers from strings.
namespace Identifier;
/**
* Creates identifiers from string representations
*/
interface StringIdentifierFactory extends IdentifierFactory
{
/**
* Creates a new instance of an identifier from the given string
* representation
*
* @param non-empty-string $identifier The string representation of the
* identifier is specific to the type of identifier and
* implementation; for example, UUIDs use a specific format, while
* other identifiers may use other formats
*
* @throws Exception\InvalidArgument MUST throw if the identifier is not
* a legal value
*/
public function createFromString(string $identifier): Identifier;
}
Exceptions
IdentifierException
This is a base exception from which all identifier exceptions descend. If an implementation uses custom exception types, they MAY implement this interface.
namespace Identifier\Exception;
use Throwable;
/**
* Base interface representing generic exceptions for identifiers
*/
interface IdentifierException extends Throwable
{
}
InvalidArgument
Identifier factory methods that accept arguments MUST throw InvalidArgument
exceptions if any of the arguments are not legal values for the method.
namespace Identifier\Exception;
/**
* The argument passed is invalid
*/
interface InvalidArgument extends IdentifierException
{
}
NotComparable
If an implementation cannot or chooses not to compare a given value to an
identifier (through Identifier::compareTo() or Identifier::equals()), it
MUST throw a NotComparable exception.
namespace Identifier\Exception;
/**
* The given value cannot be compared to the identifier
*/
interface NotComparable extends IdentifierException
{
}
OutOfRange
If an implementation does not support integer values outside the range
[ PHP_INT_MIN .. PHP_INT_MAX ], it MUST throw an OutOfRange exception when
encountering integers outside this range as input arguments to
IntegerIdentifierFactory::createFromInteger() or when evaluating return
values for IntegerIdentifier::toInteger().
namespace Identifier\Exception;
/**
* The value is out of the range of acceptable values
*/
interface OutOfRange extends IdentifierException
{
}
Providers
Packages that provide an identifier/identifier implementation identify themselves
by specifying the provide property in composer.json with the package name
"identifier/identifier-implementation". For example:
{
"provide": {
"identifier/identifier-implementation": "*"
}
}
You may find these packages on Packagist.
References
Bradner, S., “Key words for use in RFCs to Indicate Requirement Levels”, BCP 14, RFC 2119, DOI 10.17487/RFC2119, March 1997, <https://www.rfc-editor.org/info/rfc2119>.
Saint-Andre, P. and J. Klensin, “Uniform Resource Names (URNs)”, RFC 8141, DOI 10.17487/RFC8141, April 2017, <https://www.rfc-editor.org/info/rfc8141>.
Twitter, “Snowflake is a network service for generating unique ID numbers at high scale with some simple guarantees.”, Commit b3f6a3c, May 2014, <https://github.com/twitter-archive/snowflake/releases/tag/snowflake-2010>.
Feerasta, A., “Universally Unique Lexicographically Sortable Identifier”, Commit d0c7170, May 2019, <https://github.com/ulid/spec>.
Davis, K., Peabody, B., and P. Leach, “Universally Unique IDentifiers (UUIDs)”, RFC 9562, DOI 10.17487/RFC9562, May 2024, <https://www.rfc-editor.org/info/rfc9562>.
Wikipedia, “Identifier”, Page version ID 1122384949, November 2022, <https://en.wikipedia.org/w/index.php?title=Identifier&oldid=1122384949>.
Public Domain Dedication
To the extent possible under law,
Ben Ramsey
has waived all copyright and related or neighboring rights to
Common Interfaces and Factories for Identifiers.