PR #213 has generated an interesting comment stream, and I think some useful clarity. I am happy to have multiple smart people agree in writing to the concept that a DID can identify anything, because this flexibility seemed to have been excluded by some verbiage I was hearing.
Now I'd like to explore a subtlety around the concept of control. I will frame this in terms of a use case that I'm familiar with in cybersecurity and malware research, but I think you'll quickly see how it might apply to use cases brought up by others.
Malware researchers typically identify malware (viruses, worms, infected or malicious files) by a sha256 hash. The first time a particular sample is seen in the wild, a researcher hashes the sample and goes to virustotal.com or some similar site to see if anybody else has seen it before. If not, the sample is uploaded to the site's DB for all the world to look at. If it is already known, then the researcher has just made a second (or a third, or a tenth) independent discovery.
Now, suppose I wrote a DID method that was all about identifying malware with DIDs. The logical identifier format would be did:mymethod:hash-of-sample. With me so far?
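To make the derivation concrete, here is a minimal sketch. It assumes the hypothetical method name `mymethod` from above and a hex-encoded SHA-256 digest as the method-specific part; the encoding and the function name `malware_did` are illustrative choices, not something any spec defines.

```python
import hashlib

def malware_did(sample: bytes) -> str:
    """Derive a DID-shaped identifier from the sample's bytes alone (hypothetical method)."""
    digest = hashlib.sha256(sample).hexdigest()
    return f"did:mymethod:{digest}"

# Placeholder bytes standing in for a real sample.
sample = b"not real malware, just example bytes"
did = malware_did(sample)
```

No key material or registration step is involved: anyone holding the same bytes computes the same string.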
Okay, now what are the control semantics?
What I have heard so far is that DIDs are always created by a controller, who can then (even in the genesis DID doc) choose to retain control or give it away (e.g., by specifying no control after the creation transaction). This makes sense for many situations.
However, that doesn't quite fit this scenario, because A) the researcher who reports the malware is never, at any time, in a "control" relationship with the sample's identifier, and would not want to be considered so; B) the identifier cannot have control semantics, even at its genesis transaction, because its derivation mechanism disallows it; C) the identifier doesn't have a DID doc. What's being identified here is content that exists, that is explicitly uncontrollable to begin with. Anybody who discovers the content will discover the same identifier. Two researchers could register the same content on two different systems of record and both would be equally valid and not in conflict.
So my question is this:
Would we be comfortable saying that DIDs can be used to identify such things, too? And if yes (which I hope is an easy answer), are we willing to not describe such a scenario as "the controller creates the DID" but rather "the DID identifies something inherently uncontrollable, so it never has a controller, even during creation; rather, it has a discoverer" (or something to that effect)?
Tagging @ewelton and @mitfik and @swcurran
This would impact issue #122 as well - in fact, it would force the selection of 1b and 2a.
There is always a controller. So 1b doesn't work.
At the bare minimum, whoever created the DID is the controller. It does not imply they are, in any way, a controller of the DID Subject. All it means is that they controlled the initial DID Document, and presumably--depending on the method--retain the ability to further modify the document.
In the malware use case, I believe a better way to model that is that the initial reporter generates a DID and issues a credential saying that malware with a given hash has been given such and such a DID, perhaps with other corroborating claims in that credential. No control relationship needs to be established between the DID Controller and the DID Subject. But no matter what the relationship between Subject and Controller is, there is ALWAYS a Controller, whether or not there is a controller property.
Alternatively, the malware discoverer could just issue a credential with that hash, without using DIDs. I'm not sure DIDs buy the use case much.
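The first model above (reporter mints a DID, then issues a credential tying it to the hash) might look roughly like the sketch below. Everything here is an assumption for illustration: `mint_did` stands in for a real method's create operation, the `did:example` identifiers are placeholders, and the dictionary is an unsigned stand-in rather than a conformant Verifiable Credential.

```python
import hashlib
import secrets

def mint_did() -> str:
    # Stand-in for a real create operation: the reporter, as controller, picks the identifier.
    return "did:example:" + secrets.token_hex(16)

def report(sample: bytes, issuer: str) -> dict:
    """Build an unsigned, credential-like statement linking a fresh DID to the sample hash."""
    return {
        "issuer": issuer,
        "claim": {
            "id": mint_did(),
            "sha256": hashlib.sha256(sample).hexdigest(),
        },
    }

sample = b"example sample bytes"
first = report(sample, issuer="did:example:reporter-1")
second = report(sample, issuer="did:example:reporter-2")
```

Two reporters of the same sample end up with two different DIDs; the only value they share is the hash inside the claim.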
@jandrieu : Let me challenge "there is always a controller" a slightly different way. If a DID doc is created (said differently, if any metadata is associated with a DID), then I agree that there is always a controller, at least at the outset. Whoever creates the DID doc (chooses the metadata) is the controller. They can make decisions about whether to continue being the controller, or to disclaim control over the content by removing control methods. Furthermore, I agree that all our thinking up until now has assumed that DIDs and DID docs are inseparable concepts, because <assumption>of course we want metadata</assumption>.
But what I'm suggesting is a scenario where an identifier needs to be created (perhaps better said, it gets discovered) without a DID doc--zero metadata--from the get-go. Nobody is allowed to define any metadata about the identifier -- even at the outset. The need here is a pure identifier that has the decentralization characteristic of DIDs, but not the resolution characteristic. It's almost like a hashlink, except it makes no claim about location (or any other properties), only about existence. What is known about malware (metadata) could vary in thousands of different ways, and be stored in databases all over the world, and nobody intends to be an authoritative source for any of it. They just want to agree that they're talking about the same thing. Full stop. And since the mechanism for generating the identifier derives it from the subject, there can never be any controller making decisions, by definition. Uncontrollable things exist, and we identify them. Since there is no resolution, there is no controller. That's what the controller controls--the DID doc/resolution, n'est-ce pas?
Now, you could say, "No. We must have a DID doc. 99.999% of DIDs are worthless without it. For the weird corner case you're bringing up, if it really exists at all, just keep the convention and live with the weirdness." That would mean that we analyze the first person to create the guaranteed-empty DID doc for malware X as its "controller" for the purposes of the DID ecosystem. But two researchers could discover it independently, with no way of proving who was first. So we have a theoretical controller role that is unavoidably ambiguous, just because we want to keep the concept of controller. If we instead say, "Yep, there's cases where something exists but its metadata is not controlled, and DIDs can point to them. In such cases, it becomes impossible to create a DID doc (because if you do that, by definition you're exercising control), but it's still a sort of DID because it's a decentralized identifier" then we get to broaden the conceptual tent of DIDs a bit.
It is not clear to me yet how this interacts with methods - perhaps some methods are capable of representing passive, persistent objects, and other methods are not - because the owner of a set of keys may always be able to update "the DID document".
On the other hand - they can't update the did itself once it is minted - and that is what matters. So the controller (of the registration) might only be able to update information about the subject's registration record.
Another thing that I worry about is the did:<method>:<data> model: if the <data> is related to the genesis key pair, then that method might not be capable of representing the "virus hash" you described above. The virus hash could only be represented as an assertion (or a VC).
In other words, for methods where the <data> part of the DID is derived from the genesis key pair, then the document belongs to the discoverer and the ability to represent an arbitrary thing, in a self-certifying manner, is simply not possible in that method. Those methods are constrained to represent "loci of control", which is an undeniably critical group - and the one which has come to dominate our thinking and discussion of DIDs.
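A small sketch of the contrast, under stated assumptions: the method names `example-key` and `example-hash` are hypothetical, and random bytes stand in for real public keys.

```python
import hashlib
import secrets

def key_derived_did(public_key: bytes) -> str:
    # The identifier commits to the creator's key, so it carries no information about any content.
    return "did:example-key:" + hashlib.sha256(public_key).hexdigest()

def content_derived_did(content: bytes) -> str:
    # The identifier commits to the content, so it carries no information about who computed it.
    return "did:example-hash:" + hashlib.sha256(content).hexdigest()

virus = b"example sample bytes"
alice_key = secrets.token_bytes(32)  # stand-in for a real public key
bob_key = secrets.token_bytes(32)    # stand-in for a second discoverer's key
```

With the key-derived form, Alice and Bob necessarily get different identifiers for the same sample; with the content-derived form they get the same one, but there is no key for a controller to hold.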
This is especially apparent in the context of "verification methods" (i.e. #190 ) when combined with the new abstract-data-model/registry approach. The ADM/Registry model forces a "union of all possible realities" model and results in very complicated modeling. For example, we will need to put this sort of information in the registry:
- capabilityDelegation field, if present, means <x>, but it MUST NOT be present when the method supports non-key hash registration in the data component of the DID and the data component of the DID is not directly related to the control mechanics of the underlying method. If the method does not support arbitrary hash registration, then the capabilityDelegation field MAY be used, subject to the definition above.
- assertionMethod field, if present, means <x>, but it MUST NOT be present when the method supports non-key hash registration in the data component of the DID and the data component of the DID is not directly related to the control mechanics of the underlying method. If the method does not support arbitrary hash registration, then the assertionMethod field MAY be used, subject to the definition above.

or the ADM/Registry needs structure like

method1, method2, etc.), then
- capabilityDelegation field MUST NOT be used
- assertionMethod field MUST NOT be used
- capabilityDelegation field is OPTIONAL and means <x>
- assertionMethod field is OPTIONAL and means <x>

Use of an @context field simplifies this substantially by allowing a DID-document to declare the semantic which applies - as in "This subject represents a locus of control" or "This subject was discovered and represents an external entity" - and, of course, depending upon the method - it may or may not be possible to render the DID document formally immutable - in which case, an actor capable of updating the DID document could "morph" the "sort of thing" the DID represents.
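As a rough sketch of what the per-method registry structure would demand of implementers: the code below reads the condition as "the method is in a listed set of hash-registration methods", which is an interpretation on my part. The method names and the `violations` helper are hypothetical.

```python
# Hypothetical set of methods that support non-key hash registration.
HASH_REGISTRATION_METHODS = {"method1", "method2"}
CONTROL_FIELDS = ("capabilityDelegation", "assertionMethod")

def violations(method: str, did_document: dict) -> list:
    """Return the control fields that are present but MUST NOT be used for this method."""
    if method not in HASH_REGISTRATION_METHODS:
        return []  # for every other method the fields are OPTIONAL
    return [field for field in CONTROL_FIELDS if field in did_document]

doc = {"id": "did:method1:abc", "assertionMethod": ["#key-1"]}
```

Every consumer of the registry would need a branch like this for each affected property, which is the modeling complexity being described.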
What this suggests to me - in answer to @dhh1128 's question
Would we be comfortable saying that DIDs can be used to identify such things, too? And if yes (which I hope is an easy answer), are we willing to not describe such a scenario as "the controller creates the DID" but rather "the DID identifies something inherently uncontrollable, so it never has a controller, even during creation; rather, it has a discoverer" (or something to that effect)?
Is that it is far from clear whether or not DIDs are suitable as generic identifiers for self-certified content. Perhaps DIDs are always and only statements by actors, about people, organizations, and things - which means
Alternatively we could move the semantics partially into methods. Perhaps we could have did-core define a set of "classes" of DIDs - each with its own ADM/Registry/@context and let methods subscribe to them somehow - perhaps with an @class attribute which names the appropriate did-core semantic model. If we did that we could possibly
The most radical suggestion would be to step out of the battle altogether, and give DID-documents a sort of "sovereignty" and let them announce what they are and how to process them using some sort of attribute that identified and advertised feature and property sets. The proposed attribute would let the creator of the DID-document assert things like
and on and on - at the discretion of the environment and suitable to the needs of adopters.
We could even say "if a DID-document says nothing, then it is assumed to follow the rules in did-core" and provide a fallback Abstract Data Model that clearly defines what it ought to be.
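A minimal sketch of that fallback rule. It borrows the `@class` attribute name floated earlier as the place where a DID-document would announce its model; both that choice and the `semantic_model` helper are assumptions for illustration only.

```python
DEFAULT_MODEL = "did-core"

def semantic_model(did_document: dict) -> str:
    """Return the model a DID-document declares, falling back to did-core when it says nothing."""
    return did_document.get("@class", DEFAULT_MODEL)

silent_doc = {"id": "did:example:123"}
declared_doc = {"id": "did:example:456", "@class": "discovered-content"}
```

A processor would then dispatch on the returned name, treating `silent_doc` exactly as did-core prescribes.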
@jandrieu re: 1b - I believe 1b is specific to the controller attribute in the DID-Doc, not the qualitative ability to control the DID-doc, simply the explicit representation of it in the DID-doc.
If there is always a controller, then the hypothesis that starts this is not possible. DIDs can not represent immutable content, they can only represent loci of control - and as such they can not really refer to things - they can only refer to the controller's name for things.
In other words - "The Moon" can not be the subject of a DID I create, "What Eric Thinks of as The Moon" is a proper scope, but "The Moon" is not.
Quoting Joe from issue #122 :
DID Controller is a functional definition. Any entity that can actually control the DID Document is a controller.
So if my theoretical DID method exists, in which it's impossible to place any metadata in the DID Document, or to create one at all, that would imply that there cannot be a controller, because nobody can perform the function that satisfies the definition.
This raises the question, of course: is the DID method that I posited allowed to exist? I can think of many use cases for it. It's highly decentralized (would score great on many rubrics), but by its lack of resolution support, it is definitely an odd duck.
What you are talking about is not a DID. It's just an identifier.
Obviously, there is still a discussion going on about what constitutes meta-data. And, to my mind, I want ALL meta-data out of the DID Document. What needs to be in the DID Document is the cryptographic material for secure interaction (everything else is meta). In some cases, that material can be deterministically derived from the DID itself, like with did:key, in which case resolving the DID is how you transform the raw DID into the DID Document.
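To illustrate resolution as a pure transformation, here is a simplified sketch. It does not reproduce did:key's actual encoding; it assumes a hypothetical `did:example-key` method whose identifier is just the hex of the public key, and the property name `publicKeyHex` is used only for illustration.

```python
PREFIX = "did:example-key:"  # hypothetical method, not the real did:key format

def resolve(did: str) -> dict:
    """Expand the DID into a document using nothing but the DID string itself."""
    if not did.startswith(PREFIX):
        raise ValueError("unsupported method")
    public_key_hex = did[len(PREFIX):]
    bytes.fromhex(public_key_hex)  # raises ValueError if the key part is not valid hex
    return {
        "id": did,
        "verificationMethod": [{
            "id": did + "#key-1",
            "controller": did,
            "publicKeyHex": public_key_hex,
        }],
    }

doc = resolve(PREFIX + "ab" * 32)
```

There is no registry lookup and nothing to update: the document is a function of the identifier.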
I think a big part of what's happening right now is people wanting to do EVERYTHING with DIDs, and I agree DIDs can refer to ANY subject. But that doesn't mean they are the right tool for every single identifier use case nor is it appropriate to pollute the core spec to support convenience features. They can be addressed in DID-AWESOME instead of DID-CORE.
If your identifier is most appropriately generated by hashing the object, GREAT. Just use that as an identifier. No DID required.
The fundamental topological shift in DIDs over other forms of identifiers, including cryptographically verifiable ones like public keys, is the level of indirection between the DID and the cryptographic material, allowing for appropriate maintenance like rotation without invalidating the DID and auditing of transitions in material over the lifetime of the DID. Without that level of indirection, which is the fundamental link between DIDs and DID Documents, then you don't have DIDs, you just have an identifier.
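A toy sketch of that indirection, with loud caveats: the `Registry` class is a hypothetical in-memory stand-in for whatever a real method uses, keys are plain strings, and the authorization check a real method would require before rotation is omitted.

```python
class Registry:
    """Toy in-memory stand-in for a method's record of key material per DID."""

    def __init__(self):
        self._history = {}

    def create(self, did: str, key: str) -> None:
        self._history[did] = [key]

    def rotate(self, did: str, new_key: str) -> None:
        # A real method would verify that the current controller authorized this change.
        self._history[did].append(new_key)

    def resolve(self, did: str) -> str:
        return self._history[did][-1]  # current key material

    def audit(self, did: str) -> list:
        return list(self._history[did])  # every key the DID has ever pointed at

registry = Registry()
registry.create("did:example:123", "key-v1")
registry.rotate("did:example:123", "key-v2")
```

The DID string never changes across the rotation; only what it resolves to does, and the transitions remain auditable.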
@ewelton wrote
In other words - "The Moon" can not be the subject of a DID I create, "What Eric Thinks of as The Moon" is a proper scope, but "The Moon" is not.
That's all it ever could be.
The singular notion of "The Moon" doesn't exist. That is just what English speaking people, aka Eric, sometimes use to refer to the Earth's natural satellite. Other people use other terms.
This is the fundamental shift that VCs guarantee. All you can ever say are statements that "some issuer asserts some 'fact'", which is exactly the structure above. This is epistemologically rigorous. Imagining that "The Moon" is, in absolute knowable truth, the subject of a given DID is not. In order for such a statement to exist, we would first have to rigorously understand what "The Moon" really means to you. Then what it really means to me. Then we might be able to convince ourselves that we are talking about the same thing.
It's the same with DIDs. The only way to know if the subject is what you think it is (unless you are the controller) is to gather enough assertions about that DID to convince you of what the Subject is. And EVEN then, all you have done is convince yourself.
Reality is fundamentally unknowable. All we can do is invest resources convincing ourselves of enough shared agreement to interact reasonably.
So, this isn't about a search for Truth with a capital "T". That's a fool's errand. Rather, DIDs are a rigorous mechanism to establish cryptographically secured interactions with an arbitrary Subject. Figuring out what that Subject is or is not happens at another layer, including the mechanisms that embody what it means to "interact" with the Subject.
@jandrieu I believe there is more to it than 'just an identifier' - it is more than a UUID, because it is linked to the thing itself. It is suitable only to 'hashable' objects, and not physical objects. You can't hash a tree, you can't hash the moon - and, you can argue that you can not refer directly to "the moon" - there is a huge tradition in philosophical semantics about exactly this - and DIDs, in a sense are taking a deep philosophical stance.
So far - what seems like it works is this:
1 - DIDs can not be used to identify digital content in a shared namespace
2 - DIDs can not refer to things
3 - DIDs can be a specific actor/agent's name for a thing
In a sense, it does not matter where this falls - just as long as it falls somewhere and leads to clear and precise (and simple) language. So "the subject is the king of england", for example, would not be quite optimal; "actor-x's name for the king of england is did:123" would be the right way to say it.
So "the subject is the king of england", for example, would not be quite optimal; "actor-x's name for the king of england is did:123" would be the right way to say it.
Yes. That's what DIDs always say. But since we ALSO don't know who the Controller is, the statement "Controller's name for a thing is XYZ" is rigorously restatable as "A thing is the subject of DID XYZ".
The assertion that DID XYZ refers to the King of England goes in a VC if you want it to be rigorous, in which case you get the lovely construct that "Issuer ABC says DID XYZ is the King of England".
@jandrieu Nah, I don't quite agree with that. I would agree that saying "A thing is the subject of DID XYZ", while technically rigorous, leads to exactly the sort of miscommunication the community has been having.
I'm not sure I follow the VC comment. Who is ABC and how is ABC related to the construct?
What I'm trying to get to is making it clear, in everyday language, so that it is always apparent that A thing might have dozens of DIDs, because DIDs are "scoped" by controllers - and DIDs can not always serve as points of coordination in a discussion.
What we want for the VC case, and what is being discussed here, given the limitations of DIDs and the incorrect statements about their scope for the last few years, is a new form of identifier that can be shared by communities, and around which we can clearly say "The controller of DID XYZ says DID XYZ refers to N" and "The controller of DID ABC says DID ABC refers to N" and then let DID ABC and DID XYZ rest happy that they are talking about the same N, so that they can have fruitful discussions about attributes of N, such as "cn=King of England" vs. "cn=King of Great Britain".
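A small sketch of what the shared identifier buys, using the example values from the paragraph above. The tuple layout and the `did:example` identifiers are illustrative assumptions; "N" is the shared identifier whose form is the open question.

```python
from collections import defaultdict

# (controller's DID, shared referent, attribute asserted about the referent)
assertions = [
    ("did:example:xyz", "N", "cn=King of England"),
    ("did:example:abc", "N", "cn=King of Great Britain"),
]

# Group statements by the shared referent so they can be compared side by side.
by_referent = defaultdict(list)
for did, referent, attribute in assertions:
    by_referent[referent].append((did, attribute))
```

Without a common N, nothing in the two DIDs themselves indicates that the statements concern the same thing.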
Okay. I think Joe's given a succinct articulation of a position on the proper scope of DIDs. Thank you, Joe. I love the crispness.
I would like to ask for two things to resolve this issue:
Before we poll the group, however, I would like to offer an alternative formulation to Joe's. I don't know if I can be as crisp as he was, but I'm going to try. Going into this, let me acknowledge that the following is heresy, according to the spec; I'm only articulating it because I wonder if we're missing an opportunity here, if we could let go of tightly held notions a little. Here's the alternative worldview:
Lots of identifier schemes already exist. They have various properties. DIDs are unique in that they accomplish ALL of the following goals simultaneously:
UUIDs accomplish goal 1, but not goal 2 or 3. A given UUID can mean anything, to anybody. Fred can create it, and Jill can repurpose it. They can argue about who's right, or whether they're both right. There is no strong binding to anything in particular. Most decentralized identifiers (e.g., the names of newly born children) are similar.
IP addresses accomplish goal 2, and sometimes goal 3, but not goal 1. Most centralized systems (twitter handles, phone numbers, domain names) are similar.
DIDs accomplish goal 1 in lots of clever ways that I won't go into here.
DIDs accomplish goal 2 in one of these ways:
a. They use cryptography to bind the identifier to a controller. The controller then defines what the identifier refers to. This was the original use case for DIDs, and the one we've thought about the most.
b. They define some other intrinsic property that is objectively observable, that derives the value of the identifier, such that it is impossible for the binding to be ambiguous. A DID that identifies each element in the periodic table by its atomic number would eliminate ambiguity without having cryptographic control, while still remaining decentralized, and while still being enough of a DID to be processed by DID handlers.
Notice that in this formulation, cryptographic control is a means to an end (eliminating ambiguity), not an end in and of itself. Notice also that cryptographic control is just a special case of the other approach (objectively observable property that makes the binding unambiguous). I think that's the crux of the difference between this worldview and the other one.
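The periodic-table example of inherent binding can be sketched as follows. The method name `element`, the function names, and the four-entry excerpt of the table are all assumptions made for illustration.

```python
# Small excerpt of the periodic table, keyed by atomic number.
ELEMENTS = {1: "hydrogen", 2: "helium", 6: "carbon", 8: "oxygen"}

def element_did(atomic_number: int) -> str:
    """Derive the identifier from an objectively observable property (hypothetical method)."""
    if atomic_number not in ELEMENTS:
        raise ValueError("element not in this excerpt")
    return f"did:element:{atomic_number}"

def referent(did: str) -> str:
    """Recover what the identifier refers to, with no controller and no keys involved."""
    scheme, method, value = did.split(":", 2)
    if scheme != "did" or method != "element":
        raise ValueError("not a did:element identifier")
    return ELEMENTS[int(value)]
```

For instance, `referent(element_did(8))` yields `"oxygen"` for anyone who runs it, which is the sense in which the binding is unambiguous without cryptographic control.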
DIDs accomplish goal 3 through the use of the DID method extension mechanism.
Now that I've articulated an alternate worldview, here's the argument I'd offer in its favor: Although the world needs control-based binding for DIDs in the worst way, it also needs the other kind of binding (which I might call inherent binding). Both bindings are worthy of the moniker "decentralized identifier." UUIDs are not a good alternative because they lack the solution for ambiguity. URLs are not a good alternative because they lack decentralization of domain names. If we force the conception of DIDs to be narrow, we're setting ourselves up for a situation where another type of decentralized identifier comes along that has just as much claim to the word "decentralized", but that thinks about control differently. Result = muddiness and doubt about adoption. If we bring this ugly stepchild into the DID tent and let it take a bath, I suspect it will turn out to be cute and a good family member, in time. I don't think it would take much more than 2 or 3 paragraphs to talk about "uncontrolled decentralized identifiers" in the spec; they're way simpler than the controlled variant.
Tagging a few people who may have opinions about this interesting conversation: @peacekeeper @dlongley @msporny @burnburn @brentzundel @talltree . Please bring in others as appropriate.
@jandrieu I think I passed over https://github.com/w3c/did-core/issues/233#issuecomment-600843933 while I was writing my response. and to @dhh1128 's
Now that I've articulated an alternate worldview, here's the argument I'd offer in its favor: Although the world needs control-based binding for DIDs in the worst way, it also needs the other kind of binding. UUIDs are not a good alternative because they lack the solution for ambiguity. URLs are not a good alternative because they lack decentralization of domain names. If we force the conception of DIDs to be narrow, we're setting ourselves up for a situation where another type of decentralized identifier comes along that has just as much claim to the word "decentralized", but that thinks about control differently. Result = muddiness and doubt about adoption. If we bring this ugly stepchild into the DID tent and let it take a bath, I suspect it will turn out to be cute and a good family member, in time. I don't think it would take much more than 2 or 3 paragraphs to talk about "uncontrolled decentralized identifiers" in the spec; they're way simpler than the controlled variant.
I think this is exactly right; it is what I was trying to capture when I said that what we want
is a new form of identifier that can be shared by communities, and around which we can clearly say "The controller of DID XYZ says DID XYZ refers to N" and "The controller of DID ABC says DID ABC refers to N" and then let DID ABC and DID XYZ rest happy that they are talking about the same N, so that they can have fruitful discussions about attributes of N, such as "cn=King of England" vs. "cn=King of Great Britain"
In other words - there is a missing piece to the puzzle. DIDs are not necessarily up to the task, unless there is some tweaking to the spec - some core, fundamental tweaking and clarity.
So far, attempts to discuss the missing puzzle piece get blocked by discussion of controllers, subjects, and very obtuse technical issues. Those discussions have cut off the forest and the larger view has been lost. I like the idea of "bringing it into the DID tent, and giving it a bath"
DIDs don't solve #2
In fact, I don't think #2 is possible in any construction. We can only clarify the DID and when we refer to the DID we can use an unambiguous string of characters.
However, any statements can get attached to that identifier, by any author, and there is no way to know--at the DID level--which statement is "correct". Even if one of the statements is signed by the Controller, you can't be certain that it is "correct". Heck, you can't even prove the controller is the Subject.
What you are bumping up against is essentially Goedel's incompleteness theorem. You can't disambiguate everything. There will always be statements that cannot be proven, no matter how convoluted our schemes may be.
All we can do is anchor assertions by specific issuers to understand (and document) what they are willing to assert about a Subject, as identified by a DID. Statements about the same DID can be taken to be intended as statements about the same Subject, but even then the statements themselves may be wrong.
Content-based hashes of arbitrary content are NOT DIDs because they cannot be resolved directly to some form of cryptographic material. You could, of course, create an IPFS DID Document and have a DID method that uses its content-based address, but that hash is of the DID Document, not of the resource.
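The distinction between the two hashes can be sketched like this. It is not IPFS's actual addressing scheme: a SHA-256 over sorted-key JSON stands in for a content address, and the document contents are placeholders.

```python
import hashlib
import json

resource = b"example sample bytes"  # the thing someone wants to identify

# A placeholder DID Document carrying cryptographic material.
did_document = {"verificationMethod": [{"id": "#key-1", "publicKeyHex": "ab" * 32}]}

# Stand-in content address: hash of the serialized DID Document.
doc_bytes = json.dumps(did_document, sort_keys=True).encode()
doc_address = hashlib.sha256(doc_bytes).hexdigest()

# Hash of the resource itself, which is what the malware use case identifies by.
resource_hash = hashlib.sha256(resource).hexdigest()
```

A method built on `doc_address` identifies the document, so two people describing the same resource with different documents still get different identifiers.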
IMO, if we are going to get closure on this spec, we need to stop trying to add everything that seems like it might be convenient, and we need to stop trying to construct crazy edge cases--ESPECIALLY if you have no use cases for it (as you put it @dhh).
Maybe others with more experience in standards development can chime in. I know that VCs almost didn't get done because of mid-process shifts to support ZKPs. The consensus was that was a good thing. But it still put finishing within the required deadline at risk. Kitchen sink engineering a solution that solves everyone's problems is, IMO, an anti-pattern in a standardization process.
We need to be here locking down the simplest feature set for maximum interoperability to do the fundamental thing that DIDs do: enable cryptographically robust management of identifiers without reliance on central registry entities to keep track of who controls what. EVERYTHING else is superfluous and deserves a critical evaluation about whether or not we can remove it and still achieve the fundamental requirement of this work. EVERY add-on is another lengthy drawn out debate, additional implementation complexity, and yet another point of confusion for anyone who wants to adopt the tech. So, let's stop with the add-ons and start focusing on what we can do to minimize the complexity rather than exploring how we can extend DIDs to do extra magic. If DIDs can do that magic, it is perfectly fine to add that at another layer or in the next iteration of the spec.
@jandrieu would you then be backing this
1 - DIDs can not be used to identify digital content in a shared namespace
2 - but allow https://github.com/w3c/did-core/issues/199#issuecomment-589401616
3 - clarify that DIDs can not refer to things, only a specific actor/agent's name for a thing
4 - summarize the structure as https://github.com/w3c/did-core/issues/199#issuecomment-588686738
does that seem right?
Um... no.
DIDs can identify ANYTHING. I've said this before, so I'm surprised you'd suggest I'd back that set of statements.
My particular point here is that there are mathematical guarantees we can affirm with DIDs. That's what the cryptography gives us. Anything more than that which we can mathematically guarantee should be achieved at another level.
@jandrieu ok, so it seems like we're stuck.
It may not be possible to discuss DIDs.
Either a DID subject refers to ANYTHING, and is NOT a name for a thing scoped by a controller, or it is scoped by a controller. But I have a feeling that if I say that it represents ANYTHING then you will say that it is scoped by a controller. I am getting dizzy.
If I was trying to describe DIDs to clients and customers (which I have stopped doing, by the way) I need to be able to say something - if I say that "the subject is the King of England" to them without clarifying that there is a controller involved, they get the wrong idea. So I try to say "the subject is scoped by the controller" and then you say "no, I am surprised you said that" - I really am totally at a loss.
A DID subject is both scoped by a controller and not scoped by a controller and it is sometimes anything and sometimes restricted. I just don't get which set of constraints is in play - other than just not what anyone else is saying.
DIDs don't solve #2...any statements can get attached to that identifier, by any author, and there is no way to know--at the DID level--which statement is "correct"... Even if one of the statements is signed by the Controller, you can't be certain that it is "correct". What you are bumping up against is essentially Goedel's incompleteness theorem. You can't disambiguate everything. There will always be statements that cannot be proven, no matter how convoluted our schemes may be.
Perhaps you read #2 a bit too fast?
I'm not interested in proving the correctness of arbitrary statements about an identifier. I agree that anybody can claim any attributes they want about anything, and that it's not useful/desirable for DIDs to facilitate that. In fact, the example scheme I proposed explicitly precludes the association of any statements with the identifier other than existence/scope of reference (the subject). I'm saying that it's a defining characteristic of DIDs that they prove the correctness of exactly one type of statement, which is an assertion about scope of reference -- and I'm claiming that is a generalization of the variant you like, which is scope as proved by cryptographic evidence. Control is only interesting as a mechanism of achieving the real goal, which is knowing with confidence what you're talking about. Your own verbiage "Even if one of the statements is signed by the Controller" presupposes that it's possible to ascertain truth about this subtopic; signing is just the mechanism for proving that the scope of reference is what the Controller, not some other entity, asserts. I think this is exactly what you meant when you said the DID subject can't be the moon, but can be what the controller thinks of as the moon.
While it is true that eliminating all ambiguity is impossible, and on a philosophical level, we can't even prove that we exist rather than being figments of one another's imaginations, I am very surprised to hear anybody claim that DIDs don't provide practical clarity about what the referent is. Elsewhere you have claimed that the referent is whatever the controller wants it to be. That's an unambiguous binding. Yes, it can change. Yes, the controller can do a lousy or inconsistent job of definition. But the fact remains that whatever scope of reference is embodied in the controller's choices constitutes exactly and uncontroversially the referent for a DID at a point in time, if the binding is based on cryptographic control.
Maybe others with more experience in standards development can chime in. I know that VCs almost didn't get done because of mid-process shifts to support ZKPs
I agree that bringing this up and tackling it is a tradeoff. Eric is not alone in believing that if we don't broaden our conception, important use cases are lost. But that could be the right answer, and I would accept it if it's the will of the community (even though I continue to disagree with your other argument). So I, too, am curious to hear how other people would weigh it.
@ewelton I don't think we are stuck. We are just dealing with the fundamentals of what is knowable and what is provable. As such, we bump into issues of epistemology and Goedel's incompleteness theorem. There are bounds on what we can know and bounds on what we can prove. Any technology that purports to exceed those bounds should be considered with the same skepticism as claims of a perpetual motion machine.
That said, it is a different issue how we talk to regular folks. In the same way that it is hard to explain why perpetual motion machines will never work, it will be hard to explain the boundaries of what is knowable and provable.
@jandrieu I understand how you frame it and why you say what you are saying. But there are practical solutions to the problem @dhh1128 raised. More importantly, we just need to pick one and move forward.
What you are saying is true, but I feel you are simply missing the point of what we are saying, and are convinced that this is because we fail to appreciate your point.
The subject of a DID has no semantics - and, importantly, if the hash is cryptographically bound to the genesis key pair, then it CAN NOT serve the role of identifying digital content in a self-certifying manner. Instead, it can only be the "name" of a record that contains the target identifier.
What we are exploring is a way to augment that environment - to make self-certifying content identifiers first-class citizens. This exploration is not about mathematical provability or Cantor's Paradise.
In terms of did methods - we are starting to see 'strange methods' like did:key - which, one might argue, have a different relationship with 'controllerhood' than do blockchain-resident did methods with long running did-documents that can evolve over time and can engage in complex expressions of verification methods and service_endpoints.
The option on the table is to recognize some of those differences - and, instead of raging against them, decide if that variation can be co-opted and exploited.
In a sense it does not matter which is chosen - as long as it is chosen soon, and precisely. There is a strong argument for disallowing this sort of "content-hash" immutable element - like did:immutable:<hash> - and there are arguments for it. It is not the case that it is fundamentally impossible due to the Principle of Least Action driving the inherent increase in Entropy we commonly experience as the Arrow of Time - it is a pragmatic decision for the spec.
An object in a decentralized network needs an identifier. The DID name itself "Decentralized Identifier" suggests that there should be room to include a solution in the DID spec.
... and, for the record, semantic objects should never be governed in a decentralized network. That is why _schema.org_, etc. are open-access and free of governance. If semantics are governed they simply won't be adopted.
I may be missing the point. I certainly don't understand what @dhh is trying to get at with disambiguating. But I also don't understand your previous comments. We _can_ talk about DIDs and your suggestion that I would support those three items you listed made it seem like you didn't understand my point. If you do, great.
I really don't understand how #2 is accomplished, in any identification architecture.
- Eliminate ambiguity: make the referent of the identifier completely uncontroversial.
@dhh later expands that to
I'm saying that it's a defining characteristic of DIDs that they prove the correctness of exactly one type of statement, which is an assertion about scope of reference -- and I'm claiming that is a generalization of the variant you like, which is scope as proved by cryptographic evidence.
I'm still not following. The referent is not scoped by the DID. Rather, a link to a certain set of cryptographic material is provided by a DID Method after resolution.
That's it. That's what DIDs do. Resolve a DID and you'll get some cryptographic material that can be used to interact securely with "The Subject", whatever/whoever that is. Maybe it is the controller. Maybe it is not. It isn't well scoped at all. It can even change over time. It is completely ambiguous what it refers to.
The only DID that resolves ambiguity is this hypothetical did:immutable. Which doesn't seem like a DID at all to me. So, yes, you can change the definition of DIDs to add something like did:immutable. But you can't say DIDs have a primary function of removing ambiguity--and then use that to justify an argument FOR did:immutable--because no other DIDs do that.
Don't get me wrong: immutable ids are cool. iid:[hashtype][hash] seems like a reasonable thing to standardize. github.com/w3c-ccg/multihash seems like it's half-way there.
I just don't think that's a DID in any sense that this community has been working on.
Maybe I am missing something. In any case, I'm definitely not following the logic on how did:immutable and its kin is anything like other DIDs.
Also... I'm not raging. I'm just disagreeing. DIDs are a thing. They aren't everything. They don't solve all the identifier problems. They are not the right identifier for every kind of thing that might need an identifier. They are a particular type of identifier that might be useful for certain things. Their key distinction is the ability to find the current authoritative cryptographic material for interacting with the Subject of the DID.
Before DIDs, there was not a particularly good way to find such material, not in any definitive way, without reliance on a third party. PGP's web of trust was the best prior art in this area. DIDs are a huge advancement in the usability of cryptography for a large number of use cases. It would be great if we could just focus on getting this fundamental innovation in the books, so we can turn our attention to building the amazing services on top of DIDs that so many of us are excited about.
The name DID should really be DEI (Decentralized Entity Identifier). DID suggests that you can identify anything in a decentralized network. If an object identifier cannot be accommodated, the name DID is misleading which is a shame. We would also have to build out an entirely new standard for a DOI (Decentralized Object Identifier) which of course can be done.
In an ideal world of DIDs for everything in a decentralized network, you would have did:e:<hash> for Entity and did:o:<hash> for Object.
Which way are we going to go?
@jandrieu : I think we are talking past each other because we are talking about different manifestations of ambiguity -- and it might be because of my own clumsy language. If so, I apologize. Let me try again. And let me step away from DIDs for a minute; maybe a different context will help.
Suppose, one day, that Alice invents a brand new word: "habapookajar." She's at a party, and she applies it as an adjective to a person wearing expensive Italian clothes. Those who overhear her are pretty sure it means something sort of like "sophisticated" -- but they're not quite sure. Her meaning is ambiguous. Even if they ask Alice what she means, there's no guarantee she'll tell them the truth, or be able to give them a definition that perfectly embodies her intentions.
This is ambiguity, and I believe we're in alignment in suggesting that it's fundamentally unresolvable. Let's call that "type 1" ambiguity for a moment.
But at least we know who's the definitive authority on the meaning: Alice. Whatever she says it means, we have to accept. There's no ambiguity about that, right?
Or is there? Suppose there's another party a week later, and Bob is overheard using this word. Someone asks him if he got it from Alice, and he says "No, I invented it. Who's Alice?"
Although all ambiguity has things in common, this new ambiguity feels like it's worth putting into a second bucket. Let's call it "type 2" ambiguity. This is not ambiguity about what the word means; it's ambiguity about how to approach learning the word's meaning; we don't even know where to start.
No identity systems can resolve type 1 ambiguity.
A centralized system resolves type 2 ambiguity because the system is the acknowledged authority on the question of what the identifier refers to. That doesn't make the identifier's meaning perfectly clear (nothing can) -- but it removes any ambiguity about how to learn more. But type 2 ambiguity has always been a big problem in decentralized systems, because there is no such authority.
Part of the genius of DIDs is that they solve this problem. That's a hugely valuable innovation. We've explained that innovation in terms of cryptographic control, and if we choose to, we can continue to explain it that way. We can say that the problem is proving control, and the solution is cryptography.
But what I'm suggesting is that we can define the problem in a slightly more general way, and that this might have nice consequences. It would be a tradeoff, as you say.
Old problem statement: How do I prove control of the identifier?
Old answer: With cryptography.
New problem statement: How do I eliminate type 2 ambiguity?
New answer: So far we've imagined two ways. One is to prove control with cryptography. Another way is to derive an identifier from objectively observable properties that remove all ambiguity. Maybe we'll realize there are other ways, too.
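As a minimal sketch of that second way, assume an identifier of the form did:immutable:<hex sha256>. The method name is the hypothetical one used elsewhere in this thread, and `derive_identifier` is an illustrative helper, not part of any spec. Two parties who never coordinate arrive at the same identifier from the same bytes:

```python
import hashlib

METHOD = "immutable"  # hypothetical method name borrowed from this thread


def derive_identifier(content: bytes) -> str:
    """Derive an identifier purely from the content's SHA-256 digest."""
    digest = hashlib.sha256(content).hexdigest()
    return f"did:{METHOD}:{digest}"


# Two researchers who never coordinate, each holding the same sample bytes.
sample_seen_by_alice = b"example sample bytes"
sample_seen_by_bob = b"example sample bytes"

alice_id = derive_identifier(sample_seen_by_alice)
bob_id = derive_identifier(sample_seen_by_bob)

# Same content, same identifier: it is discovered rather than created.
print(alice_id == bob_id)
```

Nothing here involves a key pair, which is the point of the generalization: the binding comes from the content, not from a controller.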
I admit that this new formulation is a departure from the official party line. The arguments in favor of it that I'd offer are:
It explains cryptographic control's desirability from first principles, not as an end unto itself. That feels deeply true/correct to me.
It is open-ended, but not infinitely broad. It claims for DIDs all conceptual identifier territory that intends to be decentralized but not type 2 ambiguous. UUIDs and numerous naming schemes fall outside the scope for clear reasons, but other mechanisms could be discovered that have the defining properties. Maybe we'll learn something. It would be nice not to have to start a new standard when the one we've already built anticipates such possibilities.
It allows me to leverage the hard work that's been done on DIDs to solve a whole new set of problems that are currently ruled out by the insistence that DIDs must be based on control of the identifier. This set of problems has been simmering in the background of DIDs for several years now, with people never quite able to explain why they felt misaligned. It's now late in the process, but I finally feel some clarity about why the disconnect and what a solution might be. The identifier variant that derives from objective properties is a type of identifier that's "discovered" rather than "created", and I intuit (but cannot prove) that we may come to love that type of identifier and want it under the DID umbrella.
I don't think these three arguments are a slam dunk argument in favor of what I'm proposing. But I'm hoping that at least my worldview and my comments about ambiguity make better sense?
Spelt out, the two options are doi:<hash> or did:o:<hash> for an object identifier. That should probably be put to a community vote.
Wow, what a thread. That we can be having this deep a conversation about identifiers so fast that if I take my eye off the list for 2 days I miss the whole thing...amazing.
I just now have taken the time to read the whole thread, from top to bottom. Here's my thinking. As someone who has worked on DIDs from the very first version of this spec four years ago, let me put it this way:
- When did:key: came along, it was a shock to the system. A DID that did not have a DID document, but generated a DID document. Whoa! That was a head-banger. At first I said, "Hell no". But then I listened to folks and thought about the use case...and finally "widened my thinking" about what a "decentralized identifier" should be because clearly did:key: was valuable.
- did:web:. I still personally find it distasteful, but I can see the use cases, and even in that method, due to the presence of a DID document, there are still ways of using the cryptographic verifiability to work around the fact that the method has a highly centralized component.
So that brings us to did:r:. When I first heard about the concept, I said, "Oh, it's just a content hash expressed as a DID. There's no DID document. There cannot be a DID document. So clearly that's not a DID." But then, like all three examples above, I stopped and listened to the use cases and started thinking about it. And I found myself agreeing with @dhh1128 that the overall concept is simply a different application of cryptographically-verifiable identification.
In other words, if you look at a did:r:, it is indeed cryptographically verifiable, but not through its cryptographic association with a public key, but through its cryptographic association with the DID subject.
That is still a cryptographically-verifiable identifier. And it's still decentralized.
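To make that concrete, here is a rough sketch of verification by association with the subject rather than with a key. It assumes a simplified did:r:<hex sha256> syntax (the proposals in this thread mention multihash instead), and both function names are illustrative only:

```python
import hashlib


def parse_content_did(did: str) -> str:
    """Return the hex digest from 'did:r:<hex-sha256>' or raise ValueError."""
    parts = did.split(":")
    if len(parts) != 3 or parts[0] != "did" or parts[1] != "r":
        raise ValueError(f"not a did:r identifier: {did!r}")
    return parts[2]


def verify_subject(did: str, candidate: bytes) -> bool:
    """Check that the candidate bytes are the subject the identifier names."""
    expected = parse_content_did(did)
    actual = hashlib.sha256(candidate).hexdigest()
    return actual == expected


content = b"some immutable object"
did = "did:r:" + hashlib.sha256(content).hexdigest()
print(verify_subject(did, content))            # True: bytes match the identifier
print(verify_subject(did, b"tampered bytes"))  # False: different content
```

Anyone holding the bytes can run the check; no signature and no DID document are consulted.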
And it's valuable because I have spent time with the proposers of this new method and they have a MOUNTAIN of use cases for it. Entire industries might end up being built around this particular "branch" of DIDs.
And so I've "widened my aperture" once more and now agree that including content-based verification methods as valid types of DID methods makes sense. Even if they explicitly do not involve any DID document.
So I urge not just @jandrieu but all members of the WG to take a close read of this thread and see if you agree. And if it would help, perhaps the proponents might host a special call or a webinar to explain their use cases in more depth as that would probably help too.
@jandrieu
In some cases, that material can be deterministically derived from the DID itself, like with did:key, in which case resolving the DID is how you transform the raw DID into the DID Document.
Following that statement, I would say that did:r:<multihash> is the same thing: I can generate a DID Document out of it by following defined rules, which gives me the same sort of DID Document (if I am not mistaken, a valid DID doc can include only id). But in many cases I just don't need the DID Document, which does not make the DID useless.
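A minimal sketch of such a generation rule, assuming a did:r:<64 hex chars> syntax rather than a real multihash encoding, and a hypothetical `resolve_did_r` function. The document is produced from the identifier alone, with no lookup anywhere:

```python
import re

# Assumed syntax: did:r:<64 lowercase hex chars>. A real method would
# more likely use a multihash-encoded value here.
_DID_R = re.compile(r"^did:r:[0-9a-f]{64}$")


def resolve_did_r(did: str) -> dict:
    """Generate a DID document from the identifier alone, with no lookup."""
    if not _DID_R.match(did):
        raise ValueError(f"unsupported identifier: {did!r}")
    # Nothing but the id: no keys, no service endpoints, no controller.
    return {"id": did}


did = "did:r:" + "ab" * 32
print(resolve_did_r(did))
```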
Many people think that a DID points to a DID document from which you can learn more about what you can do next. But as we know, that is not true. A DID points to the key, and each method defines how to "construct" the DID Document out of a given DID. E.g., some ask you to look on the ledger to get the document, which is no different from having it in some sort of immutable DB. Others, like did:key, allow you to derive the DID Document out of the method-specific-id. If this is a commonly accepted pattern, did:r would follow the same rules. It's just that in many cases there would not be much metadata in the DID Document; in many cases it would be just empty. Or I could use
Current spec defines DID as:
Decentralized identifiers (DIDs) are a new type of identifier to provide verifiable, decentralized digital identity. These new identifiers are designed to enable the controller of a DID to prove control over it and to be implemented independently of any centralized registry, identity provider, or certificate authority.
What I would like to see is something like this:
Decentralized identifiers (DIDs) are a new type of identifier to provide verifiable, decentralized digital identity. These new identifiers are designed to enable cryptographically-verifiable identification and to be implemented independently of any centralized registry, identity provider, or certificate authority.
Why?
The reason is simple: the so-called DRI and the DID have a lot in common; at a high level they are the same thing. Getting the DID spec generic enough is beneficial for everyone. This is what makes standards really powerful: they can unify parts without limiting their use to a specific use case. In the current DID spec, the specific use case is to use a DID only for the purpose of controlling a specific DID Document. And not even that, as we already saw the movement toward making the "controlling" part optional, so you can have a DID Document which cannot be altered after creating it. It seems that we are just one small step away from DRI in that situation.
@talltree gave a very good example of how the story changed over time while thinking about what a DID should/could be. In my opinion, this shows how DIDs are slowly maturing and people are realizing that the problem they started with in the first place can be applied to a broader space.
As @dhh1128 already mentioned, the reason why we are even having this discussion is that DIDs seem to surpass any other existing standard due to their specific properties:
And the common denominator for all of the above is cryptographically-verifiable identification, as @talltree pointed out. For me personally, this gives a very solid standard which, if adopted, would give the community a lot of benefits: your wallet/agent/system/website/you name it implements one standard for cryptographically verifiable identifiers, and you can support people identity, things identity, and content identity, without saying how this identity is defined.
There was already a statement that DIDs allow you to identify anything. Yes, but through the DID Document, which in many cases is just an unnecessary step, e.g. for identification of content, and which could end up revealing too much.
I also just read the thread, and even though I find the idea intriguing to broaden the concept of DIDs to all _"cryptographically-verifiable identification"_, without DID documents and without DID controllers, I still have a preference for @jandrieu's perspective, and for sticking with the current "party line".
Unfortunately, we have monopolized the term "Decentralized Identifier (DID)", when in fact there are other "decentralized identifiers" out there. But still I believe we should stick to the mental model that all DIDs are "created" and "controlled".
(Side note: in very practical terms, deviating from this may mean deviating from the DID WG charter - see the first few bullet points).
I would argue that there are existing and better "decentralized identifiers" than DIDs, which can already be used for identifying malware or elements in the periodic table. Those identifiers are URNs.
- urn:sha1:xxxx for a long time. You could create a more modern version urn:multihash:xxx, and I think you got what this thread is all about? "Cryptographically-verifiable identifiers" that are "discovered" rather than "controlled"?
- urn:atom:xenon or urn:atom:54. Sorry, "URN namespace" doesn't sound as cool as "DID method" :(
My concern is that if we don't introduce did:o:, we may alienate certain industry sectors which would damage the global appeal of DIDs. For example, the biggest Pharma DLT consortium project that we are in talks with at the moment are discussing using a platform architecture that doesn't use DIDs at all. If we had did:o: in the spec, they would definitely adopt DIDs and, with it, we could introduce the whole DID/VC flow into their architecture. Without that DID method, they have absolutely no need to delve into the world of DIDs, which compromises our overall desire to build a truly interoperable decentralized data economy. No pressure, huh @peacekeeper
@pknowl I understand that we may want to broaden the scope of DIDs as much as possible for marketing reasons, and I'm not strongly opposed to doing that. But from a purely technical perspective, why couldn't that consortium participate in a "truly interoperable decentralized data economy" with URNs, if they want identifiers that don't need controllers or DID documents?
@peacekeeper Should URNs be used as a universal standard for object identifiers? Surely, a DID should be able to cover all 3 valid identifier states.

In terms of key interdependency, we're already taking a hash of content in the "dependent" state. It sounds weird to disregard content when no entity identifier is referenced. It almost feels like we're turning our close cousins away for Thanksgiving dinner!
@pknowl — can you unpack your chart a little bit more for us? I.e., can you explain more fully:
What is the difference between an "entity identifier" and an "object identifier"?
I suspect this is all very clear in your head, but the chart is so terse that I suspect others will have a hard time grokking it without those explanations.
@talltree - No problem at all.
1.) "Key dependent" means that an identifier is governed by an entity and therefore a signing key is required. "Key interdependent" means that an object identifier can either be governed by an entity (whereby a signing key is required) or not governed (no keys required).
2.) "Trusted" means that an identifier is governed by an entity and therefore a signing key is required to establish trust. "Immutable" means that an identifier contains a hash of the content of an object which cannot be changed. If an object identifier is governed, the controller of the signing key has control over the content contained within the associated DID-document and, as such, it can no longer be deemed immutable.
3.) An "entity identifier" is an identifier that is governed by an entity who controls the signing key. An "object identifier" is an identifier that contains a hash of the content of an object.
@talltree Perhaps this network model will help visually.
network.pdf
You'll also notice that I've changed any reference of DRI to DOI and any reference of did:r: to did:o: in my previous entries. My apologies for that. I only cracked the code after already having joined the thread. Anyway, all corrections made. The kernels are now solid.
@pknowl Thank you very much—those terms are exactly the key I needed to unpack your table.
So one way to sum up what you are proposing is to unify the world of controlled DIDs with a new world of uncontrolled DIDs by bringing into the DID world the concept of a multihash identifier.
Now, I have another ask of you (if you are willing). One reason that I suspect for @peacekeeper 's hesitation about bringing the world of "uncontrolled DIDs" into scope is the question of resolution. Given that there is no DID document, have you thought through what a DID resolver would or should return when given a did:o: to resolve?
Could it, for example, return possible network locations of an instance of the identified object? Or any other useful information about the object?
@talltree There are mindset cabinets that will be unlocked here so I'll try to approach this topic from a place of practicality.
By the very nature of "key interdependency", in the case of an _object identifier_, some of the did:o: space is already being encroached upon. That is the first issue.
The second and much broader issue is the resolution process as a whole. The method name currently supports 52 different method types. That number will grow exponentially when the world of DIDs hits the masses. This issue can be resolved by moving any location information away from the method space. If that approach were adopted and did:e: and did:o: became the two permanent method names, entity identifiers and object identifiers could be treated autonomously which would resolve any fragmentation issues. _(... not to mention enable an interplanetary solution as Elon Musk endeavours to colonise Mars!)_
@mitfik and/or @ewelton can better explain what the resolution process might look like but I thought that I should bring up the elephant in the room as "if not now, when?"
From my perspective DIDs were at a crossroads around Sept/Oct 2019. At that time we had an opportunity to view DIDs as a sort of universal namespace with certain properties linking the identifier to an underlying asset using clever cryptography. This is what made them different from identifiers like UUIDs.
There were two other properties that were attractive:
- resolve(did)
I believe this conversation fits that time in the history of DIDs better than the DIDs of today - which are much more focused. In recognition of that focus, I actually favor @peacekeeper and @jandrieu 's sensibilities around focusing "what a DID is" - but I still think it is worthwhile to present the motivation behind the did:e and did:o concept.
The DID method space still does contain a number of "odd ducks" - like peer DIDs (which I think are critical, but which are definitely a different breed), and did:key or did:web, which strain the idea of "what a DID is" in a different direction. So even if the ideas are late to the party, nothing is cast in stone and it can't hurt to present the ideas.
The core idea is that there are three categories of things in the digital universe:
The distinction between 2 & 3 is captured well by the conceptual work around DIDs, with all sorts of wonderful nuances about key rotation and control and verification methods and the like being worked out. This model is still not well understood outside of the DID community, but I think that the DID community has the kinks pretty well ironed out.
DIDs bring a lot of other baggage to the table and it is not clear whether that baggage is worth the trouble, since DIDs are not required for credential & trust technologies. You do not need DIDs for credentials or capabilities, and ubiquitous, low-effort, low-barrier to entry, existing network technology can achieve the same level of cryptographic integrity. So what is the attraction of DIDs?
For a while the attraction to me was that it could create a unified namespace for the above three classes of "network things", with a "common API for resolution" at least as perceived by an application programmer. The programmer would just install the right package and call resolve(did) - and this would bypass any deep or intrinsic reliance on DNS. We finally had a simple, level playing field for asking a bootstrapping question about resources on "the network" - which included centralized, classic, decentralized, and P2P spaces.
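The "common API" idea can be sketched roughly as a single `resolve` entry point that dispatches on the method name. Everything below is hypothetical: the registry, the two toy resolvers, and the in-memory dictionary standing in for a ledger are illustrations, not how any existing resolver is built.

```python
from typing import Callable, Dict

Resolver = Callable[[str], dict]

# Hypothetical registry: method name -> function that resolves that method.
_resolvers: Dict[str, Resolver] = {}


def register(method: str, resolver: Resolver) -> None:
    _resolvers[method] = resolver


def resolve(did: str) -> dict:
    """Single entry point for callers; dispatches on the method name."""
    parts = did.split(":", 2)
    if len(parts) != 3 or parts[0] != "did" or not parts[1] or not parts[2]:
        raise ValueError(f"not a DID: {did!r}")
    method = parts[1]
    if method not in _resolvers:
        raise LookupError(f"no resolver installed for method {method!r}")
    return _resolvers[method](did)


# Stand-in for a ledger-backed method: documents live in some store.
_fake_ledger = {"did:example:123": {"id": "did:example:123", "service": []}}
register("example", lambda did: _fake_ledger[did])

# Stand-in for a generative method: the document derives from the DID itself.
register("r", lambda did: {"id": did})

print(resolve("did:example:123"))
print(resolve("did:r:abc"))
```

The application programmer only sees `resolve(did)`; whether a method looks something up or generates its answer is hidden behind the registry.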
DIDs had the possibility to be a generic identifier that could point to anything, support a simple API for resolution (not that the implementation of that API is simple), and open the door for the global community to expand both resolution and semantics and to develop tools, facilities, and services around a "new kind of content and integrity focused address".
The current DID spec is one species, one subset of that broader vision - and I think the steps taken in the process of moving to the working group effectively crystallized DIDs as that subset. This is not any kind of indictment or complaint - it is just that the broader community is catching up to what is going on with the DID authority and so some of these broader concepts keep coming to the party.
For a while we had a shot at an internet landscape where "the essential question" asked by applications was resolve(did) - perhaps rivaling fetch(url) or lookup(host) - and this would have been a glorious development. As it stands today, DIDs are suitable for a more specific range of use cases. They will operate in parallel with non-DID components of the identity, credential, and trust technology landscape.
The issue of resolution is critical - and the fact that it is not a centerpiece of the charter, is, I think, a mistake. There are tremendous challenges towards implementing resolution for the current, restricted DID model - and those challenges are driven by the large number of methods. The large number of methods fosters "wallet siloing", where application builders choose just a subset of methods and decide if they also support other URIs - or perhaps the culture will become dominated by remote universal resolvers, which, in the years to come, might be deployed as thickly as Akamai and Cloudfront cache servers.
It is too early to tell - and I am confident the community can solve the problems. The question that plagues me is not "can we solve it" but "what is the cost/benefit ratio" - and this rests on the scope of DIDs. For a broad enough scope, a high price is attractive - but when the scope is limited, the decision is less clear.
In many ways, the focusing of DIDs is a benefit to everyone - it will help ensure the spec gets out, is cleanly done, and is well managed and under the central authorities of the W3C, W3ID, and other registry, context, and guidance providers. It will ensure that people vested in global blockchains will be well supported. It will be the anchor in the sea of chaos surrounding DIDs.
The focusing of DIDs also opens the door to a new frontier seeking identifiers that can span DIDs + centralized resources + holographic resources + immutable resources (on various storage media, like IPFS, blockchains, etc.) - and that broader space of identifier technology will draw heavily from much of the excellent work that has been done on DIDs.
It is a win-win situation, no matter how this issue plays out - the only way to lose is to keep the issue open for too long, and the essential remaining question, for me at least, is resolution.
@pknowl
You'll also notice that I've changed any reference of DRI to DOI
I think you may have to change it again :) https://www.doi.org/ = "Digital Object Identifier"
@peacekeeper More reason to pull did:o: into the DID space!
Just wanted to hop in here and offer a point of view. While I believe that these CIDs are useful, I find that it's possible to represent the same data while keeping the mental model of a controller existing and abandoning the did document.
For example, let's say I had did:immutable:b85dca566725ca2d1baee467a13561af1346953a7bf281b1e259b172f5c740ab
(that's the sha256 hash of "I know this content")
and it's published to a registry with the following did document:
{
  "@id": "did:immutable:b85dca566725ca2d1baee467a13561af1346953a7bf281b1e259b172f5c740ab"
}
then what I'm asserting as the publisher (and thereby controller) is that I know of some content capable of producing that hash, and I'm registering it for the world to know about. Furthermore, I'm saving myself the step of doing this and then revoking a key.
Put another way, I could achieve the exact same outcome by doing this with did:sov (with some crafty extensions)
did:sov:123456789abcdefghi =>
{
  "@id": "did:sov:123456789abcdefghi",
  "publicKey": [
    {
      "id": "did:example:123456789abcdefghi#keys-1",
      "type": "Ed25519VerificationKey2018",
      "controller": "did:example:pqrstuvwxyz0987654321",
      "publicKeyBase58": "H3C2AVvLMv6gmMNam3uVAjZpfkcJCwDwnZn6z3wXmqPV"
    }
  ],
  "knownContent": ["b85dca566725ca2d1baee467a13561af1346953a7bf281b1e259b172f5c740ab"]
}
and then revoking the key such that the did document looks like this:
{
  "@id": "did:sov:123456789abcdefghi",
  "knownContent": ["b85dca566725ca2d1baee467a13561af1346953a7bf281b1e259b172f5c740ab"]
}
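The only difference between the two documents is the missing publicKey entry. A rough sketch of that step as a pure function follows; it models revocation as simply dropping the key material, which is an assumption for illustration (an actual method would do this through its own update mechanism), and `abandon_control` is a made-up name:

```python
import copy


def abandon_control(doc: dict) -> dict:
    """Return a copy of the document with all key material removed."""
    stripped = copy.deepcopy(doc)
    # "publicKey" is the property used in the example document above.
    stripped.pop("publicKey", None)
    return stripped


before = {
    "@id": "did:sov:123456789abcdefghi",
    "publicKey": [
        {
            "id": "did:example:123456789abcdefghi#keys-1",
            "type": "Ed25519VerificationKey2018",
            "controller": "did:example:pqrstuvwxyz0987654321",
            "publicKeyBase58": "H3C2AVvLMv6gmMNam3uVAjZpfkcJCwDwnZn6z3wXmqPV",
        }
    ],
    "knownContent": [
        "b85dca566725ca2d1baee467a13561af1346953a7bf281b1e259b172f5c740ab"
    ],
}

after = abandon_control(before)
print(sorted(after.keys()))
```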
What I understand this proposal is suggesting though is that I should be able to develop a did method to make the identifier the content and call it a day. If you really wanted to, I could resolve it and get a did document as well, but it's pretty useless. _Albeit it's still a compliant did document._
In essence, what I'm suggesting is that this identifier is cryptographic knowledge from the controller that they know the content that produces the hash. It still has a controller, who abandons control immediately, and there are cryptographic, mathematical guarantees asserted by the controller on initial generation of the did/did document.
@jandrieu what I understand you're suggesting is that the mental model should work in a way that's non-compliant with normative statements currently in the text. Specifically, the fact that every property of a DID Document other than @id MAY be used. Not MUST/SHOULD. If this is really the mental model we're going to roll with, we should be changing a lot of the properties from MAYs to SHOULDs/MUSTs. Of particular note too, if we move to a SHOULD, I don't think that eliminates this use case because it has "valid reasons in particular circumstances to ignore a particular item".
Even for fun, in the case of identifying when malware was first spotted in the wild, I could use the created property to have some metadata about the subject (specifically the first time it was spotted and registered).
e.g.
{
  "@id": "did:immutable:b85dca566725ca2d1baee467a13561af1346953a7bf281b1e259b172f5c740ab",
  "created": "2002-10-10T17:00:00Z"
}
Again, still a valid did document that complies with the mental model of a controller abandoning the did document.
Thanks, @kdenhartog - I appreciate you walking through that proposed solution for us.
@peacekeeper @jandrieu - So as not to disrupt current resolution processes, perhaps the DIDWG will concede to allow did:immutable: to become a new method type for immutable content with DID-document abandoned. Your thoughts?
Tagging a few people who may have opinions about this suggested way forward:
@dlongley @msporny @burnburn @brentzundel @talltree @dhh1128 @ewelton @mitfik
@kdenhartog I really like your idea above - to me, the genius of having the @context flexibility was that it moved the MUST/SHOULD model around verification methods (and service_endpoints) outside of the core spec (making the core spec extremely light and focused).
The ability of the definition of the data representation of the cryptographic material required for credential and capability processing to exist outside of the DID-spec would create a soft-coupling between DIDs and non-DID URI ids. With a common spec, external to the DID-core, that defined fields like capabilityDelegation and keyAgreement you have broad interoperability, not just internal to the DID universe but also with the world of credentials and capabilities in the large scale efforts actively being built outside of this community.
When viewed in that light, having the DID-spec be very crisp and lightweight, and simply identifying the relationship with some key-specs using an @context (or @context-like) switch - the door is open to easily defining semantics for 'found objects' that clearly express the appropriate model. You could have a 'discoverer' field for example.
You would not want to support 'mutable' data for such an object - definitely not service endpoints, nor cryptographic control material. But it still buys you a lot - having the ability to have a consistent, non-controller scoped name for an immutable digital asset, like a virus footprint would allow malware experts (organizations and individuals) to reliably associate information about how to fix, other ways to detect, and any other manner of derived data about the malware.
Issuers could link information directly about "malware footprint X" and not "controller Y's name for malware footprint X"- currently they could do that with urn:multihash:footprint but that just begs the question of resolution and metadata - where do you get that little hint that "this is a virus footprint, discovered by Y on DATE"? It is possible, but it strengthens the growing sense that DIDs are a fringe element in the world of credentials and capabilities.
In the world where DIDs remain central to credentials and capabilities, the ability to register "found objects" like virus footprints would open the world to subscribing - not to virus detection updates from specific companies based on SSL certificates at the end of https://malwarehelp.co/updates, but to a broader field of expertise (and perhaps more rapid response). You could choose to trust specific sources, and you could accept updates to your anti-malware software based on trust in issuers rather than trust in the SSL certificate and subscriptions.
The key is in the ability to issue a VC about "malware footprint X" instead of "controller Y's name for malware footprint X" - it takes the unwanted controller out of the loop, and does not introduce the need for a centralized malware database at the end of some DNS-mediated URL. This is a step towards Doc Searls' intention economy, and away from the customer-capture-and-control model that dominates the malware industry. It also helps reduce the inherent surveillance in pinging the update server.
The only thing that bothers me about the approach you describe above is the "surrender of control" being in good faith - unless I read that wrong or am lagging in my tracking the current edge of DID thinking. It is an area where KERI, and the transparency that comes with witness records, could record the completion of "surrender of control" - so you could work around the limitations of the DID spec and identify "immutable objects" - being those for which one could document the surrender of control. It would make the abstract data model and fixed semantics of the current DID-core more complicated, but that's what standard processing libraries are for.
One final thought about the benefits of prying the cryptographic material and service endpoints out of the DID-core spec, and into associated specs is that it would help address the tension of having resolution out of scope while the service_endpoints definition and URL discussions are in-scope.
With that lightening of the DID-core burden you get a very nice compartmentalization of concerns, and some excellent linkage with the broader world of identity, credential, and trust-technologies - the ones that will be in play when DIDs mature to the point of broad adoption.
But again, there is tremendous progress being made in the current DID-spec model - and we must take care not to derail that, even if it means that DIDs become a niche solution in the broader universe of URI mediated credential and capability processing.
Wow, I must admit I am coming to this even later than @talltree and am astounded at the length and depth as well.
I'd like to make some comments at a meta level.
One of the prime killers of standards is scope creep. As one of the chairs, my job is to avoid that where possible. The challenge is that as others become aware of and join the work, it is natural for them to see ways in which they could make "just a small change" that would enable other use cases.
@jandrieu and @peacekeeper are correct in their description of the original intent of DIDs, the intent that spawned the incubation work and then the formal standardization track we are on. (And sorry, @talltree, but the notion of a DID document as something presented/generated rather than (necessarily) stored has existed since BTCR, the very first DID method)
@ewelton is correct that the latest reasonable time for a major perspective shift was when this group kicked off last September. Actually, even that was too late - charter development was the time.
However, I like what I am now seeing - an attempt to describe how to accomplish at least some of the "new" use cases presented in this thread via mechanisms already existing in and envisioned by the spec. This is how standards succeed, by accepting a functional if non-ideal use of the original idea. Remember, we can always do a 2.0 or create a completely different standard after learning through use of this one. In the mean time, an imperfect or limited standard that is COMPLETED is vastly superior to one that undergoes a revamp every year when new perspectives are added.
Soooo, I am not expressing an opinion, at all, on the value of the alternate perspective from @dhh1128 and @pknowl or the specific proposal from @kdenhartog . I am merely suggesting that creative directions like that suggested by @kdenhartog are typically more successful ways to get a new, potentially larger raft of use cases addressed because they dramatically limit scope creep and thus may allow us to finish this within a reasonable time span.
So keep talking!
Thanks, @burnburn . Your post adds a wonderful sense of calm to this thread and gives us the necessary movement and oxygen to continue the discussion. At this stage, the argument is more philosophical than technological.
I strongly believe that a decentralized data network will be much more stable if we only have one identifier throughout the space - _DIDs for everything_. That would allow us to put all other identifier standards out to pasture with a graceful "thank you for getting us to this point."
I'm attaching a mini-deck so that everyone can visualise exactly what that means. _DIDs for everything in the big blue circle_. See first slide.
Identifiers.pdf
@kdenhartog Hats off to you for a wonderfully simple twist to thinking about what @dhh1128 and others on this thread have been proposing. The way you describe it, did:immutable: actually strikes me to be much more like did:key, i.e., the contents of the DID document are cryptographically related to the DID itself.
I like it. In fact I like it so much—and it fits within the existing spec so easily—that I propose that one of the proponents write a PR.
@dhh1128 Thanks for instigating a vibrant and much needed discussion.
@kdenhartog Thanks for coming up with a simple solution for us to get our teeth into.
@mitfik and I will start writing the did:immutable: method. Kyle has kindly offered to assist but would rather not be the sole maintainer (which makes total sense as this method really falls on the _Decentralized Semantics_ side of the model). He has suggested that did:key: would be a great basic template to work from and that it might be worth getting the Protocol Labs folks (creators of _multihash_) to join in too.
We'll get something down in writing asap for review.
Thanks, everyone.
We already have:
- urn:sha1:xxxx
- ipfs://xxxx
- hash://sha256/xxxx
- ni:///sha-256;xxxx
The last item in that list says in the Abstract: _This document defines a set of ways to identify a thing (a digital object in this case) using the output from a hash function._
I agree that it would be possible to come up with DID methods that define the CREATE and READ operations in very creative ways, and by all means, go ahead. I'm just not sure what the use of DIDs really adds, especially if you don't need controllers and DID documents. You can achieve everything with plain URNs or other identifiers that are much simpler than DIDs, while still being interoperable (since everything is a URI).
BTW as a side note, if I understood @dhh1128 's original argument correctly, I think this thread is not only about DIDs that are cryptographically derived from the subject, but also - more broadly - about DIDs that are objectively observable from the subject in other ways, e.g. atomic numbers of elements in the periodic table. I look forward to reading the DID method specifications and how they define the CREATE operations. How are the DIDs for elements in the periodic table created? Should we make CREATE optional and argue that some DIDs may just "have always existed and are discovered rather than created"? Or did God invoke the CREATE operation for those DIDs? How "decentralized" is that? :)
The more I think about this, the less concerned I am, so please feel free to go ahead!
Just one request regarding did:immutable: I think the method name is not ideal, since it may be misunderstood to mean that these DIDs always refer to the same subject and cannot be reassigned to other subjects. But this is actually a property of all DIDs.
@peacekeeper I do not think that "elements" work, as they are not immutable digital assets, but rather are "objects in the physical world". I believe that mapping those objects makes sense to always have a point of reference - e.g. a custodial owner.
In other words NIST and CERN might define periodic tables of elements - but there is no way for them to agree upon using an inviolate identifier, although they can provide some "claim" that says "NIST claims that what NIST calls Vibranium is the same thing as what CERN calls Vibranium" - as in
did:<nist> asserts that did:<vibranium-1>, controlled by did:<nist>, is 'the same thing as' did:<vibranium-2>, controlled by did:<cern>
this is a claim that can be refuted when it turns out that their values for the Atomic Weight differ by 0.001 units, and therefore may actually be two independent things, or it can be held "as equivalent" in so far as it is useful.
The downside is that there is no way to talk about "Vibranium" without positing a semantic frame of reference. It may seem that the word works, but "Vibranium" is not deterministically resolvable into anything. I think this is a fundamental property of linking the digital and physical world. Truly immutable objects only exist in the digital world - and when they cease to exist they do so wholesale, not through decay and mutation, which changes the hash.
So I believe that there is a deep requirement that the numeric component is derivable from the content, and as you indicate, there are several options available when dealing with these creatures. There is no compelling reason to push them into DIDs-of-today. However, a while back there was a conflict-free pathway to including this sort of uniquely digital object under the umbrella of did resolution. This was something that is not possible with URNs, but is possible with some other frameworks on the list. Thus, if this were a year earlier, we could reasonably have talked about including a common infrastructure for resolution which would unify the multiple strategies you list, such that resolve(did) was all you ever did - hiding the complexity of 'method of access' along the same lines as we do with did methods today.
However - I absolutely agree that the window for capturing these items in this version of the W3C DID spec has passed - except through some strategy like @kdenhartog suggested, which is interesting in so far as it plays within the rules defined by the central DID authority. ;)
When it comes to a name - I definitely have no strong love of the term immutable - but I am not sure what the correct name is because there is a risk of conveying the idea that they could refer to physical objects - like elements in the periodic table, or abstractions like center of mass, or even numbers - and that would be incorrect. Only objects for which a hash (e.g. urn:multihash:
It is this property which can not hold when talking about the physical world, and requires the cryptographic binding and proof of control that DIDs establish.
@peacekeeper @ewelton This method is specific to a _non-governed immutable object_. Does that trigger a method name that you would be happy with? Open to suggestions but we're confined to those three words.
Let me start off by saying that I don’t think the purpose of DIDs is to encroach on every other identifier. I think there are clearly understood concepts that we have no normative statements to enforce, and that’s what my proposal intended to show, but I didn’t do a great job conveying it.
For example, I believe that did:atom:carbon is valid today as I read the spec even though I don’t think it should be. Looking at some of the rules around method specific identifier generation looks to set some mental model assumptions, but doesn’t go so far as to explicitly say this is not possible. I believe this should be changed. I think if we’re changing normative statements for this method, it should be to further constrain what method specific identifiers can be right now so that human friendly identifiers are less likely to occur. Not trying to create an all encompassing identifier because as @burnburn pointed out this is going to make it much harder to get the spec across the line. How we make that testable is going to be the hard part and that’s what I think will take philosophical questions of the abstract into concrete text that strengthens the purpose of dids.
The reason I believe we should be doing this is because I think this will be what brings teeth to the claim “decentralized”.
Additionally I think the purpose of this did method should be used to further explore and show what’s the difference between DIDs and URNs. One of the main aspects in my mind that differentiates the two would be resolvability of metadata.
Where the metadata discussion goes and the normative statements that come out of it could be another place we limit this method. If our metadata approach ends up very extensible, then that will enable more functionality in this did method. On the other hand, if we decide to heavily restrict metadata in the did document as @jandrieu has suggested, I think what I originally described will be the extent of what this did method can cover.
So to make my position clear on this, I find the idea of cryptographically bound content identifiers fitting within this spec either way. I think it fits clearly in our mental model as it stands today. What I disagree with is that we should be creating an all encompassing identifier that can do anything and everything, but as it stands today the spec allows me to do that. If others disagree with that as well we need to start adding normative statements to enforce this now. The best way to do that will be to push the boundaries with conversations like this AND with concrete proposals for changes to the spec. Otherwise it’s inevitable for someone to come along in five years and define did:atom because of some extraneous reason they decided dids were the best way to do this.
The maximum number of letters in any other DID method on the registry is 7. On that basis, I propose that we shave off a couple of letters. I think did:immutab: works perfectly well both in meaning and visually.
I'd prefer something more along the lines of did:cid or did:hash personally. I think it doesn't cause the confusion that Markus highlighted above around the subject being mutable. However, I don't think this would support update functionality because the controller abandons control at the time of publishing if we went in the direction of my proposal.
I like did:hash:. I think we should avoid any mention of id in the method space as that is already explicit in did:.
I really like did:hash:
Please, no... imprecise and do you see what Github turns it into, above. :)
Name it something more precise... like did:cid (content identifier) as @kdenhartog mentioned above, or did:chash (content hash)
did:chash: (content) or did:ohash: (object) ?
Going back to the original definition ...
An "object identifier" is an identifier that contains a hash of the content of an object.
In that definition, "hash of the content" is a function and the "object" is a target.
Does the DID method type usually depict a function or a target? That could help steer the final naming decision.
@ewelton may have an opinion here also. He is an expert in physical / digital convergence.

@pknowl Actually - I'm not so sure we need to head out and name this method quite yet. This has been an interesting exercise, but I am not quite sure that a method is ready for prime time.
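I still am unclear about two things:
1.) How is control surrender enforced?
2.) How does resolution work (e.g. what is the relationship w/ an underlying registry)?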
I would like to see some clear use cases of these as dids, instead of using the alternatives such as those mentioned by @peacekeeper, which are adequate to the task at hand.
I also want to be clear that the intention was never to simply replace existing identifier schemes for the sake of replacement. The appeal of content-bound identifiers made sense in pre-WG DID days as part of a larger effort to support verifiable identifiers and
If there is some clarity on how this method would work in terms of resolution, and how we could guarantee that the registered object fingerprint is no longer under the control of anyone (which may not be possible using DIDs) then a method is warranted and could be valuable. Once those are answered, a name might suggest itself, and without answers to those questions I would suggest not making a DID method.
The outcome we want to avoid is adding a method to the pile that looks like it does one thing, but in fact does another. If all we do is create a "regular DID" with an additional field, that is insufficient. The method space is too large relative to what DIDs are today, and we should resist adding to that pile without a strong value proposition.
Thanks, @ewelton . That is a sound argument but, going back to my original argument of _DIDs for everything_ in a decentralised network which allows us to move into a synergistic future with better naming conventions and smarter identifiers, I'm keen to keep investigating.
@kdenhartog - Are you able to answer Eric's first question ...
1.) How is control surrender enforced?
@mitfik - Are you able to answer Eric's second question ...
2.) How does resolution work (e.g. what is the relationship w/ an underlying registry)?
Let's hammer that out before coming back to method naming.
Just so I don't have to scroll back later on, can someone also give me a definitive answer on whether a DID method type should depict a function or a target? Thanks.
We've said a lot of words here. I have tried to keep this brief (and failed). HOWEVER, I am responding with a different illustration of what I see as the defining mismatch between content-based identifiers and DIDs.
This thread has shifted my sense of how we communicate what a DID is. Regardless of whether we adopt this new kind of DID as something we, as a standards effort, want to incorporate, we should definitely update the language in the spec so the mismatch can be minimized for future readers. People have a hard time understanding how DIDs do what they do, which is vital to understand if they are to judge whether DIDs are appropriate for their needs. However the technical questions resolve, we definitely have a documentation problem.
Here's what clicked for me as I was trying to understand how we are talking past each other.
DIDs are a framework for cryptographically proving control over an identifier without relying on a trusted third party.
This is what's new. This is what's different.
This proposal to "nuance" our mental model abandons that and would create a new class of DID which is essentially uninteroperable with other DIDs. I'll call these CIDs for content identifiers, which have all the characteristics described by others. As I've stated many times, they sound awesome. They will be useful. It makes sense to standardize a way to use them.
Consider the use cases document:
https://w3c.github.io/did-use-cases/
First, two of the first four essential characteristics of DIDs are not met by CIDs:
- cryptographically verifiable: it should be possible to prove control of the identifier cryptographically;
- resolvable: it should be possible to discover metadata about the identifier.
Maybe I'm missing something on #4, but to my understanding, revealed knowledge cannot establish control in the way that secret knowledge can. If you must reveal the knowledge to satisfy the cryptography, as you do with hashes, you cannot prove anything cryptographically without ceding equivalent control to the recipient of the proof. It leaks control and therefore isn't suitable as a control framework.
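The point about revealed knowledge can be illustrated with a short sketch. It assumes a hypothetical scheme in which "control" of a hash-derived identifier is shown by presenting the preimage; the names identifier_for, prove, and verify are invented for illustration.

```python
import hashlib


def identifier_for(content: bytes) -> str:
    """Hash-derived identifier (hypothetical format: hex SHA-256)."""
    return hashlib.sha256(content).hexdigest()


def prove(content: bytes) -> bytes:
    """The only possible 'proof' is the preimage itself, sent in the clear."""
    return content


def verify(identifier: str, proof: bytes) -> bool:
    """The verifier recomputes the hash of the revealed preimage."""
    return identifier_for(proof) == identifier


# Alice "proves control" to Bob by revealing the content.
content = b"content known to Alice"
ident = identifier_for(content)
proof_from_alice = prove(content)
assert verify(ident, proof_from_alice)

# Bob now holds everything Alice presented and can replay it to Carol.
proof_from_bob = proof_from_alice
assert verify(ident, proof_from_bob)  # Carol cannot tell Bob from Alice
```

With a key pair, by contrast, the prover signs a fresh challenge and the private key never leaves their hands, so the verifier gains nothing it can replay.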
Second, of the 13 actions enabled by DIDs, only the first two are supported by CIDs:
3.1 Create
3.2 Present
3.3 Authenticate
3.4 Sign
3.5 Resolve
3.6 Dereference
3.7 Verify Signature
3.8 Rotate
3.9 Modify Service Endpoint
3.10 Forward / Migrate
3.11 Recover
3.12 Audit
3.13 Deactivate
CIDs can't be used to Authenticate, Sign, Resolve, Dereference, Verify Signature, Rotate, Modify Service Endpoint, Forward / Migrate, Recover, Audit, or Deactivate.
Third, the reason DIDs are useful in decentralized identity is precisely because of the ability to demonstrate control. Not because they identify only a particular class of thing or because they can disambiguate anything.
(FWIW, even @dhh's second definition of disambiguate wrt Alice's definition is unknowable and unprovable. Because people other than Alice can use the DID as a subject without getting confirmation from Alice that they are using it in the way that she means it. And even if they did, there is still the risk of semantic drift as Alice's sense of what she means evolves over time.)
The way DIDs bootstrap digital identity, in the most typical use case where Subject==Holder==Controller (whether or not the issuer is identified by DID) is as follows:
Two stages.
First, you get the credential.
Stage I
Second, you use the credential.
Stage II
At this point, the Verifier knows that the current presenter of the VC has proven control over the same secret information as the subject, and therefore, with a specific level of assurance they can accept that the current presenter is one of the following:
We always have to allow for #3. That's the weakness in the system. However, the entirety of modern cryptography has this weakness, which is why keys MUST be kept secret if they are to have any use whatsoever.
It is the ability to perform this proof of control that ties the issuance of a VC to its presentation so that a Verifier can have some proof that the party presenting the credential is, in fact, the entity given that credential, which to the best knowledge of the issuer was believed to be the subject of that credential.
You could, of course, use a third party to demonstrate proof of control. You just ask Facebook who they believe is the current presenter. They'll use their own authentication approach then present their result. The whole point of DIDs is to enable this sort of bootstrapping of verifiability WITHOUT relying on the likes of Facebook. That's what makes DIDs unique and valuable.
CIDs can't be used in this fashion. As such, they just don't do--CAN'T DO--the fundamental thing that DIDs were created to do.
Yes, we can attempt to interpret the "decentralized" part of the DID name in the hope of supporting all the kinds of identifiers that can be rigorously created without a trusted third party, but, when we can't even agree on the meaning of the word "decentralized", that seems like a particular kind of madness. No offense to @dhh1128 @pknowl @ewelton or any other proponents of this idea. It's just that shoehorning in an incompatible, non-interoperable notion of DIDs because of lexical similarity with an ill-defined term just doesn't stack up for me.
That said, I do like CIDs. They have been implemented as URNs in several forms from urn:hash to urn:sha. The particular variation proposed here might deserve its own namespace, such as urn:cid or perhaps if it builds on multihash, urn:multihash.
However, since
I can't help but come to the conclusion that CIDs are not DIDs.
If it doesn't look like a duck and doesn't quack like a duck, it's probably not a duck.
It might be a bird. It might taste delightful when prepared in the Peking style, but it still probably isn't a duck.
The did:o: identifiers would not sit in any identity registries.
There is some precedent for this. The DNS RFCs specifically exclude the .onion root domain (and a few others) from fully complying with the DNS standard. See specifically https://tools.ietf.org/html/rfc7686
-- Christopher Allen
I'm sorry, is the proposal here to have a did:o namespace that then has multiple methods underneath it?
For example
did:o:sha:123...
did:o:multihash:abc...
did:o:myHash:xyz
Is that what you're suggesting @pknowl?
@jandrieu I want to clarify - I am not a proponent of adding content based identifiers into the current model of DIDs. This is because of the two reasons I enumerated - lack of solution to resolution, and no way to fully "surrender control" - and "reproducing" simple urns but calling them DIDs is silly - and besides did:o:sha:123 doesn't assist resolution at all, because it is missing location information.
One of the mistakes made in the DID model is the strange handling of resolution - DIDs contain some location information but rely on a bunch of secret hidden magic to make them resolvable. Resolution is critical, and leaving it out of scope is just part of what I consider "a long series of mistakes" beginning around mid 2019.
Current DIDs have become defined the way you define them as the result of evolution of the community. DIDs were more open to flexibility and interpretation in the past. Alternative approaches to DIDs lost out in the sea of privacy, control, and decentralization voices - and that is fine. The rubric idea became myopically focused on decentralization, so we lost most of the structure for navigating the alternatives. The use cases became focused on what I consider a niche world. The collapse of semantic flexibility meant we got onto the road of "the one true DID"
So, to be clear - I believe that there are legitimate use cases for these sorts of "non-controlled" and "verifiable" content-based identifiers. And I believe that 1 year ago would have been a great time to sweep them into DIDland so that we could build them into the resolution infrastructure. And I believe that the flexible semantics we had 1 year ago gave a very clean path to model this larger landscape inclusively and to the benefit of the global community.
However, as of today, DIDs are more focused - they are a much more specific thing, and that means that a spec will be produced and we'll get some nifty tools out. It also means that I think that getting these sorts of capabilities into the DID landscape, for the goals @pknowl identifies, might not be viable today - the window has closed and it is time to work with the DIDs we have, not the DIDs we want. Maybe there is a way to shoehorn them into the authoritative model of DIDness, but it will take a cleverer person than me to do it.
Don't get me wrong - there has been a lot of great work and thought behind DIDs-of-today - but DIDs are neither revealed truth nor natural law, they are the result of a negotiated specification that reflects the loudest and most energetic voices. Since those have focused on privacy paternalism, control, anti-correlation, and a particular interpretation of decentralization - that is what we have. I am excited to see a lot of the work that is going on, but these DIDs are just not that relevant to my use cases - there are alternatives which I can use today to deliver "improved-sovereignty" and "improved government and business processes" through the use of non-DID grounded credentials and capabilities. When DIDs are mature and in broad adoption, it will be easy to incorporate them into my world and further improve sovereignty - and I am looking forward to that.
What makes DIDs strong for some people makes them weak for others - and that is normal. What is most important is that the spec stabilizes and is released. There is always room for adaptation in the next round of specs, and via alternative specs - so I support this effort to the extent that it does not derail or retard the delivery of a clear specification - whatever it winds up saying.
Many thanks for pointing me to that link, @ChristopherA . Very much appreciated.
@jandrieu - For our purposes, we're not interested in location, we just need to know that the content is immutable. Perhaps resolution characteristics and MIME-type would be held in the associated DID document. I would expect the did:o: namespace to be very simple ...
did:o:<hashofcontent>
For example, if a non-governed object were moved from Drive A to Drive B, the identifier should remain the same even though the location has changed.
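A minimal sketch of that location independence, assuming the did:o:<hashofcontent> shape above with a hex SHA-256 digest as the hash (the digest choice and the function name object_did are assumptions for illustration). Two temporary directories stand in for Drive A and Drive B.

```python
import hashlib
import shutil
import tempfile
from pathlib import Path


def object_did(path: Path) -> str:
    """Derive a did:o identifier (hypothetical format) from a file's bytes only."""
    digest = hashlib.sha256(path.read_bytes()).hexdigest()
    return f"did:o:{digest}"


with tempfile.TemporaryDirectory() as drive_a, tempfile.TemporaryDirectory() as drive_b:
    original = Path(drive_a) / "object.bin"
    original.write_bytes(b"some non-governed object")
    before = object_did(original)

    # Move the object to a different location and give it a different name.
    moved = Path(shutil.move(str(original), str(Path(drive_b) / "renamed.bin")))
    after = object_did(moved)

    # The identifier depends on the content, not on the path or file name.
    print(before == after)
```

Nothing about the path or file name enters the derivation, which is also why such an identifier carries no hint about where the object can be fetched.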
@mitfik will certainly have some deeper insight into requirements and resolution.
@ewelton - I'm also acutely aware that if we get the naming convention right at this stage for non-governed objects, the Semantics side of the model would remain stable despite the release of future versions of the DID specification. This is just as much about sustainability to the network going forward as it is to non-governed objects requiring a stable identifier under the DID umbrella.
Actually, the precedent of allowing for some “special purpose domains” that do not need to fully adhere to the DNS RFCs is described more fully in Section 3 of RFC 6761.
https://tools.ietf.org/html/rfc6761#section-3
The .onion domain RFC https://tools.ietf.org/html/rfc7686 describes more fully why this top level domain meets the criteria.
I’d like to suggest that we support a similar carve out (like in RFC 6761) for how to register a “special purpose method”, but specifically do not add to our agenda to tackle specifying the nature of any such method.
This allows the did:o, etc. people to proceed with their ideas, and allows others who do not meet the full criteria of the 1.0 standard to still be able to experiment.
We could begin by registering those methods that don’t support full CRUD, marking them as “special purpose method” in the registry; the method only has to show why it qualifies as such a method.
— Christopher Allen
@ChristopherA That does seem like a particularly useful way of sorting out some of the "stranger" methods, and perhaps keeping the door open a crack for at least playing around with novel ideas. If some of those ideas catch hold, they could make it into a future version of the spec itself - but they do not have to challenge the progress achieved by focusing DIDs, and they do not need to distract by requiring additions to the use cases.
+1 !
@kdenhartog
For example, I believe that did:atom:carbon is valid today [..]
I agree with your comment.
Just wanted to point out that there's an interesting difference between did:atom:carbon and did:atom:6. In the second example ("6"), the identifier is an _"intrinsic property that is objectively observable"_ (quoting @dhh1128 here), whereas in the first example ("carbon"), that is not the case.
@peacekeeper A DID using this method-to-be-named would still have a definition of the Create operation, no? It's just that the Create operation in the DID method spec would describe the special way in which DIDs using this method are created.
RE naming, I thought the original proposal was for DIDs using this method to use the multihash format. If so, why not just call it did:multihash:.
@talltree I'm keen to name this method type did:o:, a name that can be cast in stone unhindered by future revisions to DID specifications and methodology. An "object is an object" so why not be bold from the outset.
The other argument for sticking with the "O" method type is that there will be a huge number of these identifiers woven into the fabric of the decentralized network. 50% of all identifiers (i.e. anything non-governed within the _data capture_ side of the model) will contain this method type. To help people digest, adopt and ultimately scale this new identifier type, users could simply refer to them as "DID-Os".
+1 to did:multihash over both did:immutable and did:o. The method name should be a hint to how the DIDs are created and resolved, rather than indicating what is being identified.
I think this is another interesting aspect in this thread. Almost all DID methods I am aware of don't restrict what is being identified. This one seems to have such a restriction, i.e. it can only identify what can be hashed.
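To make the "it can only identify what can be hashed" point concrete, here is a minimal sketch of how such an identifier might be derived. It is an illustration only: the method name `did:multihash` is the name floated in this thread, not a registered method, and the function name `derive_did` and the hex encoding are assumptions made for the example. The `0x12 0x20` prefix is the multihash code and digest length for sha2-256.

```python
import hashlib

# Multihash prefix for sha2-256: function code 0x12, digest length 0x20 (32 bytes).
SHA2_256_PREFIX = bytes([0x12, 0x20])


def derive_did(content: bytes, method: str = "multihash") -> str:
    """Derive an identifier from the content alone (no keys, no controller).

    The method name and the hex encoding are assumptions for illustration.
    """
    digest = hashlib.sha256(content).digest()
    multihash = SHA2_256_PREFIX + digest
    return f"did:{method}:{multihash.hex()}"


sample = b"example sample bytes"

# Two observers hashing the same bytes independently get the same identifier,
# wherever the bytes happen to be stored.
assert derive_did(sample) == derive_did(bytes(sample))

# Different content yields a different identifier.
assert derive_did(sample) != derive_did(sample + b"!")
```

Nothing in the derivation depends on who runs it or where the content lives, which is the property the thread keeps coming back to.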
@peacekeeper I suppose the method name should reflect how the community sees the DID space evolving. I, for one, hope that the argument for the development of did:e: (entity identifiers) and did:o: (object identifiers) will be supported by the DIDWG in the future. I'm not saying we need to get there tomorrow but, now that a light has been shone, it will be difficult to ignore.
We have a rare opportunity to name the object identifier correctly right off the bat whilst hinting at an elegant DID syntax evolution for the future. Why wait for governed identifiers to align to the methodology? If the identifier name is set to did:multihash:, it will inevitably have to be renamed to did:o: in the future.
If I'm missing something and did:multihash: will simply be easier to get over the line for DID v1.0 then I'll concede for the greater good but that shouldn't stop the DIDWG from investigating did:e:/did:o: further upstream in a bid to resolve the potential method-type scaling issue highlighted in this thread.
@mitfik has just messaged me saying that he has a feeling that a non-governed object identifier may need to contain more than just a simple 'multihash'. On that note, I propose that the community hold off on a casting vote until the tech guys have had a chance to further investigate what identifier characteristics should be included.
@peacekeeper
I think this is another interesting aspect in this thread. Almost all DID methods I am aware of don't restrict what is being identified. This one seems to have such a restriction, i.e. it can only identify what can be hashed.
This is critical as I see it, because it is the presence of a controller that defines the semantic space within which the identified exists. I see that as a key strength of controlled DIDs. When you and I talk about the same thing using different DIDs, the only way we can coordinate is by presenting evidence from attached and found information - external claims, credentials, and the like which are linked to the controlled document. That is very valuable, however....
The reason these were of interest was that, like urn:multihash:1234, there is a restriction on what is identified - namely that which can be hashed. It is this property that allows them nearly zero semantic ambiguity - on the order of 1 in 2^80 or better - tweakable by the hash, of course. This means that we can talk about the same thing, using an identifier, without pinning it on a negotiation.
This is useful, for example, when pointing to a credential schema or context or other primitive from which one scaffolds deterministic processing in a decentralized data economy - it provides an "open authority" without simply using DIDs to create "a new root of central authority." I find the concept of a Bitcoin Anchored Semantic every bit as Centrally Controlled as schema.org.
Hashlinks give us a lot of the power needed - and in particular they give us the thing that is missing from simply using did:whatnot:<hash> - namely, hints about location and thus a pathway to resolution. What nothing gives us yet is a specification about what sort of descriptor could come back, and that definitely has value - giving programmers a coordination point that was not bound to specific implementations, but bound to the concept of uncontrolled, self-certifying identifiers.
I also remain concerned about the maintenance of hidden control - the 'create' method would effectively be a 'register' method - but register it in what infrastructure? - which gets, again, to resolution. And it is the infrastructure of the registry which defines the possibility of true "surrender of control" vs. "good samaritan waiving" - I think it makes sense to wait to name this concept until those elements are clear:
If we cannot do these, then we have defined something equivalent to regular DIDs with a claim "this DID that I control is about urn:multihash:1234" - and those DIDs are fine, but they cannot be the foundation for scaffolding semantic processing on a decentralized data economy - for that we need a decentralized identifier with broader capabilities than DIDs.
I think for those who would like to update the mental model in ways that have been discussed in this thread, a concrete next step would be to:
* Propose that the "create" operation be made optional, just like a while ago we made "update" and "deactivate" optional, OR:
* Demonstrate in some draft version of a DID method spec how the "create" operation would be defined.
I'd say there's probably a few things we could take from this thread as well to make as additions to the did core spec. Some of the arguments against this method have pointed to a few things that are left as tribal knowledge that I'm wondering if we could get normative, testable statements for.
For example, one of @jandrieu's points struck me as pretty strong: on creation of a DID it SHOULD (could be upgraded to MUST) be possible to prove limited control of the identifier via a cryptographic mechanism.
Another one I've been toying around with is the idea of a minimum number of possible namespace entries. E.g. the method specific identifier must be able to identify at least 2^80 unique identifiers. I'm not sure this really adds much enforcement to the idea of the identifier not needing an authority to authorize access to the namespace.
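As a rough illustration of what a testable version of that statement could look like, the sketch below checks whether a method-specific identifier space is at least 2^80 entries wide. The function names and the idea of describing a namespace by alphabet size and fixed length are assumptions for the example, not proposed spec language.

```python
MIN_NAMESPACE_SIZE = 2 ** 80  # the floor floated above


def namespace_size(alphabet_size: int, length: int) -> int:
    """Number of distinct fixed-length identifiers over a given alphabet."""
    return alphabet_size ** length


def meets_floor(size: int, floor: int = MIN_NAMESPACE_SIZE) -> bool:
    """True if the namespace can hold at least `floor` distinct identifiers."""
    return size >= floor


# A sha2-256 digest written as 64 hex characters spans 2^256 identifiers.
assert meets_floor(namespace_size(16, 64))

# 20 hex characters is exactly 2^80, so it just meets the floor; 19 does not.
assert meets_floor(namespace_size(16, 20))
assert not meets_floor(namespace_size(16, 19))

# A namespace enumerating the 118 known chemical elements falls far short.
assert not meets_floor(118)
```

Python integers are arbitrary precision, so the comparison is exact rather than a floating-point estimate.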
I also like @ewelton's point about adding at least non-normative statements, and normative statements if possible, around surrendering control, because I feel that was part of the crux of what makes this possible.
@peacekeeper do you have any ideas around other things that might be worth adding for this?
Thanks, @ewelton . That is a sound argument but, going back to my original argument of _DIDs for everything_ in a decentralised network which allows us to move into a synergistic future with better naming conventions and smarter identifiers, I'm keen to keep investigating.
@kdenhartog - Are you able to answer Eric's first question ...
1.) How is control surrender enforced?
It's surrender at the point of creation by the intrinsic nature of the method. In other words, control of the knowledge is all that's necessary to create the method. Representation and proof of control is unnecessary after creation, just as it's unnecessary after all keys have been revoked in all other methods.
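One way to picture "no proof of control after creation" is that checking such an identifier needs only the content itself. The sketch below is a hedged illustration under the same assumptions as the naming discussion above: a hypothetical `did:multihash:` prefix, a hex-encoded sha2-256 multihash (`1220` followed by the digest), and a made-up helper name `matches`.

```python
import hashlib

PREFIX = "did:multihash:"  # hypothetical method name from this thread
SHA2_256 = "1220"          # multihash code 0x12, length 0x20, as hex


def matches(did: str, content: bytes) -> bool:
    """Check that `content` is what `did` identifies.

    No keys or signatures are consulted: anyone holding the content can run
    this check, and nobody can change its outcome later.
    """
    if not did.startswith(PREFIX):
        return False
    multihash_hex = did[len(PREFIX):].lower()
    if not multihash_hex.startswith(SHA2_256):
        return False
    expected_digest = multihash_hex[len(SHA2_256):]
    return expected_digest == hashlib.sha256(content).hexdigest()


did = PREFIX + SHA2_256 + hashlib.sha256(b"hello").hexdigest()
print(matches(did, b"hello"))   # True
print(matches(did, b"hello!"))  # False
```

Because the check involves no secret, there is no key for anyone to hold, rotate, or revoke, which is one reading of surrender "by the intrinsic nature of the method".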
I'm sorry, is the proposal here to have a did:o namespace that then has multiple methods underneath it? For example:
did:o:sha:123...
did:o:multihash:abc...
did:o:myHash:xyz
Is that what you're suggesting, @pknowl?
I hope not, that makes the method name even more likely to centralize around a naming authority.
I've gone quiet on this long thread that I started, but I wanted to say thank you to all the smart people who chimed in.
Re. the final pair of comments from @kdenhartog and @peacekeeper : yes to the distinction Markus was trying to highlight. When you have a property that is objectively observable as the basis of an identifier, and everybody knows what property to look for, then you have the interesting phenomenon that multiple observers will automatically be led to agree on the identifier for the object -- _even for new objects not yet discovered_. This has some very desirable benefits in a decentralized ecosystem. Perhaps Joe is right that this doesn't belong inside the DID umbrella; I'm content to let consensus rule, but just wanted to make the strongest case I could for it.
As the original opener of the issue, I am happy enough with the ensuing discussion to let it be closed now. But we can also keep it open longer if procedure or the preferences of others pushes us that way.
It looks like the author of this issue feels satisfied by the discussion that occurred. Next steps for this can go one of two ways (potentially both) I would guess. @mitfik @pknowl and I can draft a strawman did method to explore what these immutable, surrender control on creation dids would look like, or we can begin to propose language to constrain what did methods are possible.
Any opinions on which way to go?
Thanks, @kdenhartog . I believe this is now in the capable hands of @mitfik and a couple others in the HCF tech group to start working on a strawman/draft spec. The workload has suddenly gone through the roof at this end which is why this stream has slowed down. That said, I think we have everything we need for now.
I propose we close this issue then since the did method can be shared via the did method registry. Any objections?
No activity since marked pending close, closing.
Most helpful comment
Wow, I must admit I am coming to this even later than @talltree and am astounded at the length and depth as well.
I'd like to make some comments at a meta level.
One of the prime killers of standards is scope creep. As one of the chairs, my job is to avoid that where possible. The challenge is that as others become aware of and join the work, it is natural for them to see ways in which they could make "just a small change" that would enable other use cases.
@jandrieu and @peacekeeper are correct in their description of the original intent of DIDs, the intent that spawned the incubation work and then the formal standardization track we are on. (And sorry, @talltree, but the notion of a DID document as something presented/generated rather than (necessarily) stored has existed since BTCR, the very first DID method)
@ewelton is correct that the latest reasonable time for a major perspective shift was when this group kicked off last September. Actually, even that was too late - charter development was the time.
However, I like what I am now seeing - an attempt to describe how to accomplish at least some of the "new" use cases presented in this thread via mechanisms already existing in and envisioned by the spec. This is how standards succeed, by accepting a functional if non-ideal use of the original idea. Remember, we can always do a 2.0 or create a completely different standard after learning through use of this one. In the meantime, an imperfect or limited standard that is COMPLETED is vastly superior to one that undergoes a revamp every year when new perspectives are added.
Soooo, I am not expressing an opinion, at all, on the value of the alternate perspective from @dhh1128 and @pknowl or the specific proposal from @kdenhartog . I am merely suggesting that creative directions like that suggested by @kdenhartog are typically more successful ways to get a new, potentially larger raft of use cases addressed because they dramatically limit scope creep and thus may allow us to finish this within a reasonable time span.
So keep talking!