Tagging @ewelton and @mitfik and @swcurran

dhh1128 on 18 Mar 2020

This would impact issue #122 as well - in fact, it would force the selection of 1b and 2a.

ewelton on 18 Mar 2020

There is always a controller. So 1b doesn't work.

At the bare minimum, whoever created the DID is the controller. It does not imply they are, in any way, a controller of the DID Subject. All it means is that they controlled the initial DID Document, and presumably--depending on the method--retain the ability to further modify the document.

In the malware use case, I believe a better way to model that is that the initial reporter generates a DID and issues a credential saying that malware with a given hash has been given such and such a DID, perhaps with other corroborating claims in that credential. No control relationship needs to be established between the DID Controller and the DID Subject. But no matter what the relationship between Subject and Controller is, there is ALWAYS a Controller, whether or not there is a controller property.

Alternatively, the malware discoverer could just issue a credential with that hash, without using DIDs. I'm not sure DIDs buy the use case much.

jandrieu on 18 Mar 2020

@jandrieu : Let me challenge "there is always a controller" a slightly different way. If a DID doc is created (said differently, if any metadata is associated with a DID), then I agree that there is always a controller, at least at the outset. Whoever creates the DID doc (chooses the metadata) is the controller. They can make decisions about whether to continue being the controller, or to disclaim control over the content by removing control methods. Furthermore, I agree that all our thinking up until now has assumed that DIDs and DID docs are inseparable concepts, because <assumption>of course we want metadata</assumption>.

But what I'm suggesting is a scenario where an identifier needs to be created (perhaps better said, it gets discovered) without a DID doc--zero metadata--from the get-go. Nobody is allowed to define any metadata about the identifier -- even at the outset. The need here is a pure identifier that has the decentralization characteristic of DIDs, but not the resolution characteristic. It's almost like a hashlink, except it makes no claim about location (or any other properties), only about existence. What is known about malware (metadata) could vary in thousands of different ways, and be stored in databases all over the world, and nobody intends to be an authoritative source for any of it. They just want to agree that they're talking about the same thing. Full stop. And since the mechanism for generating the identifier derives it from the subject, there can never be any controller making decisions, by definition. Uncontrollable things exist, and we identify them. Since there is no resolution, there is no controller. That's what the controller controls--the DID doc/resolution, n'est-ce pas?

Now, you could say, "No. We must have a DID doc. 99.999% of DIDs are worthless without it. For the weird corner case you're bringing up, if it really exists at all, just keep the convention and live with the weirdness." That would mean that we analyze the first person to create the guaranteed-empty DID doc for malware X as its "controller" for the purposes of the DID ecosystem. But two researchers could discover it independently, with no way of proving who was first. So we have a theoretical controller role that is unavoidably ambiguous, just because we want to keep the concept of controller. If we instead say, "Yep, there's cases where something exists but its metadata is not controlled, and DIDs can point to them. In such cases, it becomes impossible to create a DID doc (because if you do that, by definition you're exercising control), but it's still a sort of DID because it's a decentralized identifier" then we get to broaden the conceptual tent of DIDs a bit.
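To make the scenario concrete, here is a minimal Python sketch of an identifier derived purely from the subject's bytes. The method name `did:immutable`, the use of SHA-256, and the hex encoding are all assumptions for illustration; nothing in did-core defines them.

```python
import hashlib


def content_did(subject_bytes: bytes) -> str:
    """Derive an identifier from the subject itself.

    'did:immutable' is a hypothetical method name; SHA-256 is an assumed hash.
    """
    digest = hashlib.sha256(subject_bytes).hexdigest()
    return f"did:immutable:{digest}"


# Two researchers who find the same sample independently compute the same
# identifier, with no registration step and no key material involved.
sample = b"placeholder bytes standing in for a malware sample"
alice_id = content_did(sample)
bob_id = content_did(bytes(sample))
print(alice_id == bob_id)  # True
```

Neither researcher is "first" in any way the identifier can record, which is the ambiguity described above: there is nothing here for a controller to control.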

dhh1128 on 18 Mar 2020

It is not clear to me yet how this interacts with methods - perhaps some methods are capable of representing passive, persistent objects, and other methods are not - because the owner of a set of keys may always be able to update "the DID document".

On the other hand - they can't update the did itself once it is minted - and that is what matters. So the controller (of the registration) might only be able to update information about the subject's registration record.

Another thing that I worry about is that a did:<method>:<data> model - if the <data> is related to the genesis key pair, then that method might not be capable of representing the "virus hash" you described above. The virus hash could only be represented as an assertion (or a VC)

In other words, for methods where the <data> part of the DID is derived from the genesis key pair, then the document belongs to the discoverer and the ability to represent an arbitrary thing, in a self-certifying manner, is simply not possible in that method. Those methods are constrained to represent "loci of control", which is an undeniably critical group - and the one which has come to dominate our thinking and discussion of DIDs.

This is especially apparent in the context of "verification methods" (i.e. #190 ) when combined with the new abstract-data-model/registry approach. The ADM/Registry model forces a "union of all possible realities" model and results in very complicated modeling. For example, we will need to put this sort of information in the registry:

capabilityDelegation field, if present, means <x>, but it MUST NOT be present when the method supports non-key hash registration in the data component of the DID and the data component of the DID is not directly related to the control mechanics of the underlying method. If the method does not support arbitrary hash registration, then the capabilityDelegation field MAY be used, subject to the definition above.
assertionMethod field, if present, means <x>, but it MUST NOT be present when the method supports non-key hash registration in the data component of the DID and the data component of the DID is not directly related to the control mechanics of the underlying method. If the method does not support arbitrary hash registration, then the assertionMethod field MAY be used, subject to the definition above.

or the ADM/Registry needs structure like

for methods which allow arbitrary hash registration and for those DIDs which do not correspond to genesis keypairs (e.g. method1, method2, etc.), then
- capabilityDelegation field MUST NOT be used
- assertionMethod field MUST NOT be used
for methods which allow only DIDs derived from genesis key pairs
- capabilityDelegation field is OPTIONAL and means <x>
- assertionMethod field is OPTIONAL and means <x>
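
As a rough sketch of what that second structure might look like in practice, the Python below encodes the per-class rules as a lookup table and checks a document against them. The class names, the `RULES` table, and the `violations` helper are hypothetical; no registry format like this exists in did-core.

```python
from enum import Enum
from typing import List


class DidClass(Enum):
    """Hypothetical classes of DID, following the two groups listed above."""
    CONTENT_DERIVED = "content-derived"  # not tied to a genesis key pair
    KEY_DERIVED = "key-derived"          # derived from a genesis key pair


# Requirement level for each property, per class (assumed shape).
RULES = {
    DidClass.CONTENT_DERIVED: {
        "capabilityDelegation": "MUST NOT",
        "assertionMethod": "MUST NOT",
    },
    DidClass.KEY_DERIVED: {
        "capabilityDelegation": "OPTIONAL",
        "assertionMethod": "OPTIONAL",
    },
}


def violations(did_class: DidClass, did_document: dict) -> List[str]:
    """Return the properties present in the document that the class forbids."""
    return sorted(
        prop
        for prop, level in RULES[did_class].items()
        if level == "MUST NOT" and prop in did_document
    )


doc = {"id": "did:example:123", "assertionMethod": []}
print(violations(DidClass.CONTENT_DERIVED, doc))  # ['assertionMethod']
print(violations(DidClass.KEY_DERIVED, doc))      # []
```

Even in this toy form, every property needs one rule per class, which is the modeling cost being described.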

use of an @context field simplifies this substantially by allowing a DID-document to declare the semantic which applies - as in "This subject represents a locus of control" or "This subject was discovered and represents an external entity" - and, of course, depending upon the method - it may or may not be possible to render the DID document formally immutable - in which case, an actor capable of updating the DID document could "morph" the "sort of thing" the DID represents.

What this suggests to me - in answer to @dhh1128 's question

Would we be comfortable saying that DIDs can be used to identify such things, too? And if yes (which I hope is an easy answer), are we willing to not describe such a scenario as "the controller creates the DID" but rather "the DID identifies something inherently uncontrollable, so it never has a controller, even during creation; rather, it has a discoverer" (or something to that effect)?

Is that it is far from clear whether or not DIDs are suitable as generic identifiers for self-certified content. Perhaps DIDs are always and only statements by actors, about people, organizations, and things - which means

update/retract (#213)
disallow https://github.com/w3c/did-core/issues/199#issuecomment-589401616
allow https://github.com/w3c/did-core/issues/199#issuecomment-588686738

Alternatively we could move the semantics partially into methods. Perhaps we could have did-core define a set of "classes" of DIDs - each with its own ADM/Registry/@context and let methods subscribe to them somehow - perhaps with an @class attribute which names the appropriate did-core semantic model. If we did that we could possibly

keep (#213)
allow https://github.com/w3c/did-core/issues/199#issuecomment-589401616
allow https://github.com/w3c/did-core/issues/199#issuecomment-588686738

The most radical suggestion would be to step out of the battle altogether, and give DID-documents a sort of "sovereignty" and let them announce what they are and how to process them using some sort of attribute that identified and advertised feature and property sets. The proposed attribute would let the creator of the DID-document assert things like

this DID represents a locus of control
this DID is an actor/agent's description of a context
this DID represents a discovered digital artifact as not controlled

and on and on - at the discretion of the environment and suitable to the needs of adopters.

We could even say "if a DID-document says nothing, then it is assumed to follow the rules in did-core" and provide a fallback Abstract Data Model that clearly defines what it ought to be.

ewelton on 18 Mar 2020

@jandrieu re: 1b - I believe 1b is specific to the controller attribute in the DID-Doc, not the qualitative ability to control the DID-doc, simply the explicit representation of it in the DID-doc.

If there is always a controller, then the hypothesis that starts this is not possible. DIDs can not represent immutable content, they can only represent loci of control - and as such they can not really refer to things - they can only refer to the controller's name for things.

In other words - "The Moon" can not be the subject of a DID I create, "What Eric Thinks of as The Moon" is a proper scope, but "The Moon" is not.

ewelton on 18 Mar 2020

Quoting Joe from issue #122 :

DID Controller is a functional definition. Any entity that can actually control the DID Document is a controller.

So if my theoretical DID method exists, in which it's impossible to place any metadata in the DID Document, or to create one at all, that would imply that there cannot be a controller, because nobody can perform the function that satisfies the definition.

This begs the question, of course, is the DID method that I posited allowed to exist? I can think of many use cases for it. It's highly decentralized (would score great on many rubrics), but by its lack of resolution support, it is definitely an odd duck.

dhh1128 on 18 Mar 2020

What you are talking about is not a DID. It's just an identifier.

Obviously, there is still a discussion going on about what constitutes meta-data. And, to my mind, I want ALL meta-data out of the DID Document. What needs to be in the DID Document is the cryptographic material for secure interaction (everything else is meta). In some cases, that material can be deterministically derived from the DID itself, like with did:key, in which case resolving the DID is how you transform the raw DID into the DID Document.
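
As a rough illustration of "deterministically derived", the Python sketch below expands a did:key-style identifier into a minimal document without any registry lookup. It assumes the Ed25519 multicodec prefix 0xed01 and base58btc multibase encoding (the leading "z") used by the did:key method draft; the document shape and the `Ed25519VerificationKey2018` type are simplified assumptions, and the 32-byte key is a placeholder rather than a real key.

```python
B58 = "123456789ABCDEFGHJKLMNPQRSTUVWXYZabcdefghijkmnopqrstuvwxyz"
ED25519_PREFIX = b"\xed\x01"  # assumed multicodec prefix for Ed25519 public keys


def b58encode(data: bytes) -> str:
    n = int.from_bytes(data, "big")
    out = ""
    while n:
        n, rem = divmod(n, 58)
        out = B58[rem] + out
    leading_zeros = len(data) - len(data.lstrip(b"\x00"))
    return "1" * leading_zeros + out


def b58decode(text: str) -> bytes:
    n = 0
    for ch in text:
        n = n * 58 + B58.index(ch)
    body = n.to_bytes((n.bit_length() + 7) // 8, "big")
    leading_ones = len(text) - len(text.lstrip("1"))
    return b"\x00" * leading_ones + body


def did_for_key(public_key: bytes) -> str:
    """Build a did:key-style identifier from a raw 32-byte public key."""
    if len(public_key) != 32:
        raise ValueError("expected a 32-byte Ed25519 public key")
    return "did:key:z" + b58encode(ED25519_PREFIX + public_key)


def resolve(did: str) -> dict:
    """Expand the identifier into a minimal document; no network, no registry."""
    if not did.startswith("did:key:z"):
        raise ValueError("not a base58btc did:key identifier")
    fingerprint = did[len("did:key:"):]
    raw = b58decode(fingerprint[1:])  # drop the multibase 'z'
    if raw[:2] != ED25519_PREFIX or len(raw) != 34:
        raise ValueError("unsupported key type in this sketch")
    return {
        "id": did,
        "verificationMethod": [{
            "id": f"{did}#{fingerprint}",
            "type": "Ed25519VerificationKey2018",  # simplified assumption
            "controller": did,
            "publicKeyBase58": b58encode(raw[2:]),
        }],
    }


placeholder_key = bytes(range(32))  # NOT a real key
did = did_for_key(placeholder_key)
print(resolve(did)["id"] == did)  # True
```

The same identifier always expands to the same document, so nothing about it can be rotated or updated.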

I think a big part of what's happening right now is people wanting to do EVERYTHING with DIDs, and I agree DIDs can refer to ANY subject. But that doesn't mean they are the right tool for every single identifier use case nor is it appropriate to pollute the core spec to support convenience features. They can be addressed in DID-AWESOME instead of DID-CORE.

If your identifier is most appropriately generated by hashing the object, GREAT. Just use that as an identifier. No DID required.

The fundamentally topological shift in DIDs over other forms of identifiers, including cryptographically verifiable ones like public keys, is the level of indirection between the DID and the cryptographic material, allowing for appropriate maintenance like rotation without invalidating the DID and auditing of transitions in material over the lifetime of the DID. Without that level of indirection, which is the fundamental link between DIDs and DID Documents, then you don't have DIDs, you just have an identifier.
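
A toy Python sketch of that indirection, under heavy assumptions: `DidRecord` is a hypothetical in-memory stand-in for whatever a method uses to store state, and the strings "key-1" and "key-2" are placeholders for real key material and proofs of control.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class DidRecord:
    """Hypothetical record: the DID stays fixed while the key behind it changes."""
    did: str
    key_history: List[str] = field(default_factory=list)

    @property
    def current_key(self) -> str:
        return self.key_history[-1]

    def rotate(self, authorized_by: str, new_key: str) -> None:
        # Stand-in for a real proof of control by the current key.
        if authorized_by != self.current_key:
            raise PermissionError("rotation must be authorized by the current key")
        self.key_history.append(new_key)


record = DidRecord("did:example:123", ["key-1"])
record.rotate(authorized_by="key-1", new_key="key-2")

print(record.did)          # did:example:123
print(record.current_key)  # key-2
print(record.key_history)  # ['key-1', 'key-2']
```

The identifier survives the rotation and the history remains auditable. An identifier that is itself the hash of a key or of content has no equivalent of `rotate`.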

jandrieu on 18 Mar 2020

👍2

@ewelton wrote

In other words - "The Moon" can not be the subject of a DID I create, "What Eric Thinks of as The Moon" is a proper scope, but "The Moon" is not.

That's all it ever could be.

The singular notion of "The Moon" doesn't exist. That is just what English speaking people, aka Eric, sometimes use to refer to the Earth's natural satellite. Other people use other terms.

This is the fundamental shift that VCs guarantee. All you can ever say are statements that "some issuer asserts some 'fact'", which is exactly the structure above. This is epistemologically rigorous. Imagining that "The Moon" is, in absolute knowable truth, the subject of a given DID is not. In order for such a statement to exist, we would first have to rigorously understand what "The Moon" really means to you. Then what it really means to me. Then we might be able to convince ourselves that we are talking about the same thing.

It's the same with DIDs. The only way to know if the subject is what you think it is (unless you are the controller) is to gather enough assertions about that DID to convince you of what the Subject is. And EVEN then, all you have done is convince yourself.

Reality is fundamentally unknowable. All we can do is invest resources convincing ourselves of enough shared agreement to interact reasonably.

So, this isn't about a search for Truth with a capital "T". That's a fool's errand. Rather, DIDs are a rigorous mechanism to establish cryptographically secured interactions with an arbitrary Subject. Figuring out what that Subject is or is not happens at another layer, including the mechanisms that embody what it means to "interact" with the Subject.

jandrieu on 18 Mar 2020

👍2

@jandrieu I believe there is more to it than 'just an identifier' - it is more than a UUID, because it is linked to the thing itself. It is suitable only to 'hashable' objects, and not physical objects. You can't hash a tree, you can't hash the moon - and, you can argue that you can not refer directly to "the moon" - there is a huge tradition in philosophical semantics about exactly this - and DIDs, in a sense are taking a deep philosophical stance.

So far - what seems like it works is this:
1 - DIDs can not be used to identify digital content in a shared namespace
2 - DIDs can not refer to things
3 - DIDs can be a specific actor/agent's name for a thing

In a sense, it does not matter where this falls - just as long as it falls somewhere and leads to clear and precise (and simple) language. So "the subject is the king of england" for example, would not be quite optimal "actor-x's name for the king of england is did:123" would be the right way to say it.

ewelton on 18 Mar 2020

So "the subject is the king of england" for example, would not be quite optimal "actor-x's name for the king of england is did:123" would be the right way to say it.

Yes. That's what DIDs always say. But since we ALSO don't know who the Controller is, the statement "Controller's name for a thing is XYZ" is rigorously restatable as "A thing is the subject of DID XYZ"

The assertion that DID XYZ refers to the King of England goes in a VC if you want it to be rigorous, in which case you get the lovely construct that "Issuer ABC says DID XYZ is the King of England".
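
The "Issuer says DID is X" construct can be sketched as plain data. This is a simplification in Python, not the Verifiable Credentials data model: the `Assertion` tuple, the `did:example` identifiers, and the `claims_about` helper are hypothetical, and signatures are omitted entirely.

```python
from collections import defaultdict
from typing import Dict, List, NamedTuple


class Assertion(NamedTuple):
    issuer: str   # who is making the statement
    subject: str  # the DID the statement is about
    claim: str    # what the issuer asserts


assertions = [
    Assertion("did:example:abc", "did:example:xyz", "is the King of England"),
    Assertion("did:example:def", "did:example:xyz", "is the King of Great Britain"),
]


def claims_about(subject: str, items: List[Assertion]) -> Dict[str, List[str]]:
    """Group claims about one DID by issuer; no claim is marked as 'correct'."""
    by_issuer = defaultdict(list)
    for item in items:
        if item.subject == subject:
            by_issuer[item.issuer].append(item.claim)
    return dict(by_issuer)


for issuer, claims in claims_about("did:example:xyz", assertions).items():
    print(issuer, "says did:example:xyz", claims[0])
```

Every statement stays attributed to its issuer, and deciding which one to believe happens outside the DID layer.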

jandrieu on 18 Mar 2020

@jandrieu Nah, I don't quite agree with that. I would agree that saying "A thing is the subject of DID XYZ", while technically rigorous, leads to exactly the sort of miscommunication the community has been having.

I'm not sure I follow the VC comment. Who is ABC and how is ABC related to the construct?

What I'm trying to get to is making it clear, in everyday language, so that it is always apparent that A thing might have dozens of DIDs, because DIDs are "scoped" by controllers - and DIDs can not always serve as points of coordination in a discussion.

What we want for the VC case, and what is being discussed here - given the limitations of DIDs and the incorrect statements about their scope for the last few years - is a new form of identifier that can be shared by communities, and around which we can clearly say "The controller of DID XYZ says DID XYZ refers to N" and "The controller of DID ABC says DID ABC refers to N" and then let DID ABC and DID XYZ rest happy that they are talking about the same N, so that they can have fruitful discussions about attributes of N, such as "cn=King of England" vs. "cn=King of Great Britain"

ewelton on 18 Mar 2020

Okay. I think Joe's given a succinct articulation of a position on the proper scope of DIDs. Thank you, Joe. I love the crispness.

I would like to ask for two things to resolve this issue:

A survey of the group to see if they agree with Joe's rule of thumb.
Assuming yes, a new PR against the DID spec that summarizes the thinking, so future readers of the spec don't wonder whether DIDs apply to their use case. (I am happy to volunteer to raise such a PR, contingent on #1 and on the rest of my comment. Or someone else can.)

Before we poll the group, however, I would like to offer an alternative formulation to Joe's. I don't know if I can be as crisp as he was, but I'm going to try. Going into this, let me acknowledge that the following is heresy, according to the spec; I'm only articulating it because I wonder if we're missing an opportunity here, if we could let go of tightly held notions a little. Here's the alternative worldview:

Lots of identifier schemes already exist. They have various properties. DIDs are unique in that they accomplish ALL of the following goals simultaneously:

Decentralize: allow identifiers to be created by anyone, without permission or coordination.
Eliminate ambiguity: make the referent of the identifier completely uncontroversial.
Provide extensibility: define a methodology whereby new subcategories can be defined without a central authority, yet guarantee that common processing remains viable.

UUIDs accomplish goal 1, but not goal 2 or 3. A given UUID can mean anything, to anybody. Fred can create it, and Jill can repurpose it. They can argue about who's right, or whether they're both right. There is no strong binding to anything in particular. Most decentralized identifiers (e.g., the names of newly born children) are similar.

IP addresses accomplish goal 2, and sometimes goal 3, but not goal 1. Most centralized systems (twitter handles, phone numbers, domain names) are similar.

DIDs accomplish goal 1 in lots of clever ways that I won't go into here.

DIDs accomplish goal 2 in one of these ways:

a. They use cryptography to bind the identifier to a controller. The controller then defines what the identifier refers to. This was the original use case for DIDs, and the one we've thought about the most.

b. They define some other intrinsic property that is objectively observable, that derives the value of the identifier, such that it is impossible for the binding to be ambiguous. A DID that identifies each element in the periodic table by its atomic number would eliminate ambiguity without having cryptographic control, while still remaining decentralized, and while still being enough of a DID to be processed by DID handlers.

Notice that in this formulation, cryptographic control is a means to an end (eliminating ambiguity), not an end in and of itself. Notice also that cryptographic control is just a special case of the other approach (objectively observable property that makes the binding unambiguous). I think that's the crux of the difference between this worldview and the other one.

DIDs accomplish goal 3 through the use of the DID method extension mechanism.

Now that I've articulated an alternate worldview, here's the argument I'd offer in its favor: Although the world needs control-based binding for DIDs in the worst way, it also needs the other kind of binding (which I might call inherent binding). Both bindings are worthy of the moniker "decentralized identifier." UUIDs are not a good alternative because they lack the solution for ambiguity. URLs are not a good alternative because they lack decentralization of domain names. If we force the conception of DIDs to be narrow, we're setting ourselves up for a situation where another type of decentralized identifier comes along that has just as much claim to the word "decentralized", but that thinks about control differently. Result = muddiness and doubt about adoption. If we bring this ugly stepchild into the DID tent and let it take a bath, I suspect it will turn out to be cute and a good family member, in time. I don't think it would take much more than 2 or 3 paragraphs to talk about "uncontrolled decentralized identifiers" in the spec; they're way simpler than the controlled variant.

dhh1128 on 18 Mar 2020

👍3

Tagging a few people who may have opinions about this interesting conversation: @peacekeeper @dlongley @msporny @burnburn @brentzundel @talltree . Please bring in others as appropriate.

dhh1128 on 18 Mar 2020

@jandrieu I think I passed over https://github.com/w3c/did-core/issues/233#issuecomment-600843933 while I was writing my response. And to @dhh1128's

Now that I've articulated an alternate worldview, here's the argument I'd offer in its favor: Although the world needs control-based binding for DIDs in the worst way, it also needs the other kind of binding. UUIDs are not a good alternative because they lack the solution for ambiguity. URLs are not a good alternative because they lack decentralization of domain names. If we force the conception of DIDs to be narrow, we're setting ourselves up for a situation where another type of decentralized identifier comes along that has just as much claim to the word "decentralized", but that thinks about control differently. Result = muddiness and doubt about adoption. If we bring this ugly stepchild into the DID tent and let it take a bath, I suspect it will turn out to be cute and a good family member, in time. I don't think it would take much more than 2 or 3 paragraphs to talk about "uncontrolled decentralized identifiers" in the spec; they're way simpler than the controlled variant.

I think this is exactly right. It is what I was trying to capture when I wrote that what we want

is a new form of identifier that can be shared by communities, and around which we can clearly say "The controller of DID XYZ says DID XYZ refers to N" and "The controller of DID ABC says DID ABC refers to N" and then let DID ABC and DID XYZ rest happy that they are talking about the same N, so that they can have fruitful discussions about attributes of N, such as "cn=King of England" vs. "cn=King of Great Britain"

In other words - there is a missing piece to the puzzle. DIDs are not necessarily up to the task, unless there is some tweaking to the spec - some core, fundamental tweaking and clarity.

So far, attempts to discuss the missing puzzle piece get blocked by discussion of controllers, subjects, and very obtuse technical issues. Those discussions have cut off the forest and the larger view has been lost. I like the idea of "bringing it into the DID tent, and giving it a bath"

ewelton on 18 Mar 2020

👍1

DIDs don't solve #2

In fact, I don't think #2 is possible in any construction. We can only clarify the DID and when we refer to the DID we can use an unambiguous string of characters.

However, any statements can get attached to that identifier, by any author, and there is no way to know--at the DID level--which statement is "correct". Even if one of the statements is signed by the Controller, you can't be certain that it is "correct". Heck, you can't even prove the controller is the Subject.

What you are bumping up against is essentially Goedel's incompleteness theorem. You can't disambiguate everything. There will always be statements that cannot be proven, no matter how convoluted our schemes may be.

All we can do is anchor assertions by specific issuers to understand (and document) what they are willing to assert about a Subject, as identified by a DID. Statements about the same DID can be taken to be intended as statements about the same Subject, but even then the statements themselves may be wrong.

Content-based hashes of arbitrary content are NOT DIDs because they cannot be resolved directly to some form of cryptographic material. You could, of course, create an IPFS DID Document and have a DID method that uses its content-based address, but that hash is of the DID Document, not of the resource.

IMO, if we are going to get closure on this spec, we need to stop trying to add everything that seems like it might be convenient, and we need to stop trying to construct crazy edge cases--ESPECIALLY if you have no use cases for it (as you put it @dhh).

Maybe others with more experience in standards development can chime in. I know that VCs almost didn't get done because of mid-process shifts to support ZKPs. The consensus was that was a good thing. But it still put finishing within the required deadline at risk. Kitchen sink engineering a solution that solves everyone's problems is, IMO, an anti-pattern in a standardization process.

We need to be here locking down the simplest feature set for maximum interoperability to do the fundamental thing that DIDs do: enable cryptographically robust management of identifiers without reliance on central registry entities to keep track of who controls what. EVERYTHING else is superfluous and deserves a critical evaluation about whether or not we can remove it and still achieve the fundamental requirement of this work. EVERY add-on is another lengthy drawn out debate, additional implementation complexity, and yet another point of confusion for anyone who wants to adopt the tech. So, let's stop with the add-ons and start focusing on what we can do to minimize the complexity rather than exploring how we can extend DIDs to do extra magic. If DIDs can do that magic, it is perfectly fine to add that at another layer or in the next iteration of the spec.

jandrieu on 18 Mar 2020

👍4

@jandrieu would you then be backing this

1 - DIDs can not be used to identify digital content in a shared namespace
2 - but allow https://github.com/w3c/did-core/issues/199#issuecomment-589401616
3 - clarify that DIDs can not refer to things, only a specific actor/agent's name for a thing
4 - summarize the structure as https://github.com/w3c/did-core/issues/199#issuecomment-588686738

does that seem right?

ewelton on 18 Mar 2020

Um... no.

DIDs can identify ANYTHING. I've said this before, so I'm surprised you'd suggest I'd back that set of statements.

jandrieu on 18 Mar 2020

My particular point here is that there are mathematical guarantees we can affirm with DIDs. That's what the cryptography gives us. Anything more than that which we can mathematically guarantee should be achieved at another level.

jandrieu on 18 Mar 2020

@jandrieu ok, so it seems like we're stuck.

It may not be possible to discuss DIDs.

Either a DID subject refers to ANYTHING, or it is a name for a thing scoped by a controller. But I have a feeling that if I say that it represents ANYTHING then you will say that it is scoped by a controller. I am getting dizzy.

If I was trying to describe DIDs to clients and customers (which I have stopped doing, by the way) I need to be able to say something - if I say that "the subject is the King of England" to them without clarifying that there is a controller involved, they get the wrong idea. So I try to say "the subject is scoped by the controller" and then you say "no, I am surprised you said that" - I really am totally at a loss.

A DID subject is both scoped by a controller and not scoped by a controller and it is sometimes anything and sometimes restricted. I just don't get which set of constraints is in play - other than just not what anyone else is saying.

ewelton on 18 Mar 2020

DIDs don't solve #2...any statements can get attached to that identifier, by any author, and there is no way to know--at the DID level--which statement is "correct"... Even if one of the statements is signed by the Controller, you can't be certain that it is "correct". What you are bumping up against is essentially Goedel's incompleteness theorem. You can't disambiguate everything. There will always be statements that cannot be proven, no matter how convoluted our schemes may be.

Perhaps you read #2 a bit too fast?

I'm not interested in proving the correctness of arbitrary statements about an identifier. I agree that anybody can claim any attributes they want about anything, and that it's not useful/desirable for DIDs to facilitate that. In fact, the example scheme I proposed explicitly precludes the association of any statements with the identifier other than existence/scope of reference (the subject). I'm saying that it's a defining characteristic of DIDs that they prove the correctness of exactly one type of statement, which is an assertion about scope of reference -- and I'm claiming that is a generalization of the variant you like, which is scope as proved by cryptographic evidence. Control is only interesting as a mechanism of achieving the real goal, which is knowing with confidence what you're talking about. Your own verbiage "Even if one of the statements is signed by the Controller" presupposes that it's possible to ascertain truth about this subtopic; signing is just the mechanism for proving that the scope of reference is what the Controller, not some other entity, asserts. I think this is exactly what you meant when you said the DID subject can't be the moon, but can be what the controller thinks of as the moon.

While it is true that eliminating all ambiguity is impossible, and on a philosophical level, we can't even prove that we exist rather than being figments of one another's imaginations, I am very surprised to hear anybody claim that DIDs don't provide practical clarity about what the referent is. Elsewhere you have claimed that the referent is whatever the controller wants it to be. That's an unambiguous binding. Yes, it can change. Yes, the controller can do a lousy or inconsistent job of definition. But the fact remains that whatever scope of reference is embodied in the controller's choices constitute exactly and uncontroversially the referent for a DID at a point in time, if the binding is based on cryptographic control.

Maybe others with more experience in standards development can chime in. I know that VCs almost didn't get done because of mid-process shifts to support ZKPs

I agree that bringing this up and tackling it is a tradeoff. Eric is not alone in believing that if we don't broaden our conception, important use cases are lost. But that could be the right answer, and I would accept it if it's the will of the community (even though I continue to disagree with your other argument). So I, too, am curious to hear how other people would weigh it.

dhh1128 on 19 Mar 2020

👍2

@ewelton I don't think we are stuck. We are just dealing with the fundamentals of what is knowable and what is provable. As such, we bump into issues of epistemology and Goedel's incompleteness theorem. There are bounds on what we can know and bounds on what we can prove. Any technology that purports to exceed those bounds should be considered with the same skepticism as claims of a perpetual motion machine.

That said, it is a different issue how we talk to regular folks. In the same way that it is hard to explain why perpetual motion machines will never work, it will be hard to explain the boundaries of what is knowable and provable.

jandrieu on 19 Mar 2020

@jandrieu I understand how you frame it and why you say what you are saying. But there are practical solutions to the problem @dhh1128 raised. More importantly, we just need to pick one and move forward.

What you are saying is true, but I feel you are simply missing the point of what we are saying, and are convinced that this is because we fail to appreciate your point.

The subject of a DID has no semantics - and, importantly, if the hash is cryptographically bound to the genesis key pair, then it CAN NOT serve the role of identifying digital content in a self-certifying manner. Instead, it can only be the "name" of a record that contains the target identifier.

What we are exploring is a way to augment that environment - to make self-certifying content identifiers first-class citizens. This exploration is not about mathematical provability or Cantor's Paradise.

In terms of did methods - we are starting to see 'strange methods' like did:key - which, one might argue, have a different relationship with 'controllerhood' than do blockchain-resident did methods with long running did-documents that can evolve over time and can engage in complex expressions of verification methods and service_endpoints.

The option on the table is to recognize some of those differences - and instead of raging against them, decide if that variation can be co-opted and exploited.

In a sense it does not matter which is chosen - as long as it is chosen soon, and precisely. There is a strong argument for disallowing this sort of "content-hash" immutable element - like did:immutable:<hash> - and there are arguments for it. It is not the case that it is fundamentally impossible due to the Principle of Least Action driving the inherent increase in Entropy we commonly experience as the Arrow of Time - it is a pragmatic decision for the spec.

ewelton on 19 Mar 2020

👍1

An object in a decentralized network needs an identifier. The DID name itself "Decentralized Identifier" suggests that there should be room to include a solution in the DID spec.

pknowl on 20 Mar 2020

👍1

... and, for the record, semantic objects should never be governed in a decentralized network. That is why _schema.org_, etc. are open-access and free of governance. If semantics are governed they simply won't be adopted.

pknowl on 20 Mar 2020

I may be missing the point. I certainly don't understand what @dhh is trying to get at with disambiguating. But I also don't understand your previous comments. We _can_ talk about DIDs and your suggestion that I would support those three items you listed made it seem like you didn't understand my point. If you do, great.

I really don't understand how #2 is accomplished, in any identification architecture.

Eliminate ambiguity: make the referent of the identifier completely uncontroversial.

@dhh later expands that to

I'm saying that it's a defining characteristic of DIDs that they prove the correctness of exactly one type of statement, which is an assertion about scope of reference -- and I'm claiming that is a generalization of the variant you like, which is scope as proved by cryptographic evidence.

I'm still not following. The referent is not scoped by the DID. Rather, a link to a certain set of cryptographic material is provided by a DID Method after resolution.

That's it. That's what DIDs do. Resolve a DID and you'll get some cryptographic material that can be used to interact securely with "The Subject" whatever/whoever that is. Maybe it is the controller. Maybe it is not. It isn't well scoped at all. It can even change over time. It is completely ambiguous what it refers to.

The only DID that resolves ambiguity is this hypothetical did:immutable. Which doesn't seem like a DID at all to me. So, yes, you can change the definition of DIDs to add something like did:immutable. But you can't say DIDs have a primary function of removing ambiguity--and then use that to justify an argument FOR did:immutable--because no other DIDs do that.

Don't get me wrong: immutable ids are cool. iid:[hashtype][hash] seems like a reasonable thing to standardize. github.com/w3c-ccg/multihash seems like it's half-way there.

I just don't think that's a DID in any sense that this community has been working on.

Maybe I am missing something. In any case, I'm definitely not following the logic on how did:immutable and its kin are anything like other DIDs.

Also... I'm not raging. I'm just disagreeing. DIDs are a thing. They aren't everything. They don't solve all the identifier problems. They are not the right identifier for every kind of thing that might need an identifier. They are a particular type of identifier that might be useful for certain things. Their key distinction is the ability to find the current authoritative cryptographic material for interacting with the Subject of the DID.

Before DIDs, there was not a particularly good way to find such material, not in any definitive way, without reliance on a third party. PGP's web of trust was the best prior art in this area. DIDs are a huge advancement in the usability of cryptography for a large number of use cases. It would be great if we could just focus on getting this fundamental innovation in the books, so we can turn our attention to building the amazing services on top of DIDs that so many of us are excited about.

jandrieu on 20 Mar 2020

👍2

The name DID should really be DEI (Decentralized Entity Identifier). DID suggests that you can identify anything in a decentralized network. If an object identifier cannot be accommodated, the name DID is misleading which is a shame. We would also have to build out an entirely new standard for a DOI (Decentralized Object Identifier) which of course can be done.

In an ideal world of DIDs for everything in a decentralized network, you would have did:e:<hash> for Entity and did:o:<hash> for Object.

Which way are we going to go?

pknowl on 20 Mar 2020

👍1

@jandrieu : I think we are talking past each other because we are talking about different manifestations of ambiguity -- and it might be because of my own clumsy language. If so, I apologize. Let me try again. And let me step away from DIDs for a minute; maybe a different context will help.

Suppose, one day, that Alice invents a brand new word: "habapookajar." She's at a party, and she applies it as an adjective to a person wearing expensive Italian clothes. Those who overhear her are pretty sure it means something sort of like "sophisticated" -- but they're not quite sure. Her meaning is ambiguous. Even if they ask Alice what she means, there's no guarantee she'll tell them the truth, or be able to give them a definition that perfectly embodies her intentions.

This is ambiguity, and I believe we're in alignment in suggesting that it's fundamentally unresolvable. Let's call that "type 1" ambiguity for a moment.

But at least we know who's the definitive authority on the meaning: Alice. Whatever she says it means, we have to accept. There's no ambiguity about that, right?

Or is there? Suppose there's another party a week later, and Bob is overheard using this word. Someone asks him if he got it from Alice, and he says "No, I invented it. Who's Alice?"

Although all ambiguity has things in common, this new ambiguity feels like it's worth putting into a second bucket. Let's call it "type 2" ambiguity. This is not ambiguity about what the word means; it's ambiguity about how to approach learning the word's meaning; we don't even know where to start.

No identity systems can resolve type 1 ambiguity.

A centralized system resolves type 2 ambiguity because the system is the acknowledged authority on the question of what the identifier refers to. That doesn't make the identifier's meaning perfectly clear (nothing can) -- but it removes any ambiguity about how to learn more. But type 2 ambiguity has always been a big problem in decentralized systems, because there is no such authority.

Part of the genius of DIDs is that they solve this problem. That's a hugely valuable innovation. We've explained that innovation in terms of cryptographic control, and if we choose to, we can continue to explain it that way. We can say that the problem is proving control, and the solution is cryptography.

But what I'm suggesting is that we can define the problem in a slightly more general way, and that this might have nice consequences. It would be a tradeoff, as you say.

Old problem statement: How do I prove control of the identifier?
Old answer: With cryptography.

New problem statement: How do I eliminate type 2 ambiguity?
New answer: So far we've imagined two ways. One is to prove control with cryptography. Another way is to derive an identifier from objectively observable properties that remove all ambiguity. Maybe we'll realize there are other ways, too.

I admit that this new formulation is a departure from the official party line. The arguments in favor of it that I'd offer are:

It explains cryptographic control's desirability from first principles, not as an end unto itself. That feels deeply true/correct to me.
It is open-ended, but not infinitely broad. It claims for DIDs all conceptual identifier territory that intends to be decentralized but not type 2 ambiguous. UUIDs and numerous naming schemes fall outside the scope for clear reasons, but other mechanisms could be discovered that have the defining properties. Maybe we'll learn something. It would be nice not to have to start a new standard when the one we've already built anticipates such possibilities.
It allows me to leverage the hard work that's been done on DIDs to solve a whole new set of problems that are currently ruled out by the insistence that DIDs must be based on control of the identifier. This set of problems has been simmering in the background of DIDs for several years now, with people never quite able to explain why they felt misaligned. It's now late in the process, but I finally feel some clarity about why the disconnect and what a solution might be. The identifier variant that derives from objective properties is a type of identifier that's "discovered" rather than "created", and I intuit (but cannot prove) that we may come to love that type of identifier and want it under the DID umbrella.

I don't think these three arguments are a slam dunk argument in favor of what I'm proposing. But I'm hoping that at least my worldview and my comments about ambiguity make better sense?

dhh1128 on 20 Mar 2020

👍3

Spelt out, the two options are doi:<hash> or did:o:<hash> for an object identifier. That should probably be put to a community vote.

pknowl on 20 Mar 2020

Wow, what a thread. That we can be having this deep a conversation about identifiers so fast that if I take my eye off the list for 2 days I miss the whole thing...amazing.

I just now have taken the time to read the whole thread, from top to bottom. Here's my thinking. As someone who has worked on DIDs from the very first version of this spec four years ago, let me put it this way:

If this had come up in the first three years of the spec development, I would have agreed with Joe, i.e., DIDs are about cryptographic verification of identifiers—and thus they always have DID docs—end of story, go away.
About a year ago, or whenever did:key: came along, it was a shock to the system. A DID that did not have a DID document, but generated a DID document. Whoa! That was a head-banger. At first I said, "Hell no". But then I listened to folks and thought about the use case...and finally "widened my thinking" about what a "decentralized identifier" should be because clearly did:key: was valuable.
Ditto all of the above for did:web:. I still personally find it distasteful, but I can see the use cases, and even in that method, due to the presence of a DID document, there are still ways of using cryptographic verifiability to work around the fact that the method has a highly centralized component.

So that brings us to did:r:. When I first heard about the concept, I said, "Oh, it's just a content hash expressed as a DID. There's no DID document. There cannot be a DID document. So clearly that's not a DID." But then, like all three examples above, I stopped and listened to the use cases and started thinking about it. And I found myself agreeing with @dhh1128 that the overall concept is simply a different application of cryptographically-verifiable identification.

In other words, if you look at a did:r:, it is indeed cryptographically verifiable, but not through its cryptographic association with a public key, but through its cryptographic association with the DID subject.

That is still a cryptographically-verifiable identifier. And it's still decentralized.
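To make "verifiable through its association with the DID subject" concrete, here is a minimal sketch. The `did:r:` layout it uses (lowercase hex SHA-256 as the method-specific id) is purely an assumption for illustration; nothing in this thread specifies the method, and a real one would more likely use a multihash. The function names are hypothetical too.

```python
import hashlib

PREFIX = "did:r:"  # hypothetical method prefix, borrowed from this thread


def derive_did(content: bytes) -> str:
    """Derive the identifier from the subject itself (assumed: hex SHA-256)."""
    return PREFIX + hashlib.sha256(content).hexdigest()


def verify_did(did: str, content: bytes) -> bool:
    """Check that `content` is the subject this identifier was derived from."""
    if not did.startswith(PREFIX):
        return False
    # No key and no DID document are consulted: re-deriving is the proof.
    return did == derive_did(content)
```

Anyone holding the bytes can run the check, and nobody holds a key that could change the answer, which is the sense in which there is nothing to control.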

And it's valuable because I have spent time with the proposers of this new method and they have a MOUNTAIN of use cases for it. Entire industries might end up being built around this particular "branch" of DIDs.

And so I've "widened my aperture" once more and now agree that including content-based verification methods as valid types of DID methods makes sense. Even if they explicitly do not involve any DID document.

So I urge not just @jandrieu but all members of the WG to take a close read of this thread and see if you agree. And if it would help, perhaps the proponents might host a special call or a webinar to explain their use cases in more depth as that would probably help too.

talltree on 20 Mar 2020

👍2

@jandrieu

In some cases, that material can be deterministically derived from the DID itself, like with did:key, in which case resolving the DID is how you transform the raw DID into the DID Document.

Following that statement, I would say that did:r:<multihash> is the same thing: I can generate a DID Document from it by following defined rules, which gives me the same sort of DID Document (if I am not mistaken, a valid DID doc can include only id). But in many cases I just don't need a DID Document, and that does not make the DID useless.

Many people think that a DID points to a DID document, from which you can learn more about what you can do next. But as we know, that is not true. A DID points to the key, and each method defines how to "construct" the DID Document from a given DID. E.g., some ask you to look on the ledger to get the document, which is no different from having it in some sort of immutable DB. Others, like did:key, allow you to derive the DID Document from the method-specific-id. If this is a commonly accepted pattern, did:r would follow the same rules. It is just that in many cases there would not be much metadata in the DID Document; in many cases it would be just empty. Or I could use
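A rough sketch of what such "defined rules" might look like, by analogy with did:key. Everything here is hypothetical: the `did:r:` syntax (64 lowercase hex characters standing in for the multihash), the function name, and the assumption that a document containing only `id` is acceptable.

```python
import re

# Assumed syntax for illustration only: did:r:<64 lowercase hex chars>.
DID_R = re.compile(r"did:r:[0-9a-f]{64}")


def resolve_did_r(did: str) -> dict:
    """Generate the DID document from the identifier alone, with no lookup."""
    if DID_R.fullmatch(did) is None:
        raise ValueError(f"not a did:r identifier: {did!r}")
    # No keys, no services, no controller: the id is the whole document.
    return {"id": did}
```

Resolution here is a pure function of the identifier, so every resolver returns the same document and no ledger or registry has to be consulted.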

The current spec defines DID as:

Decentralized identifiers (DIDs) are a new type of identifier to provide verifiable, decentralized digital identity. These new identifiers are designed to enable the controller of a DID to prove control over it and to be implemented independently of any centralized registry, identity provider, or certificate authority.

What I would like to see is something like this:

Decentralized identifiers (DIDs) are a new type of identifier to provide verifiable, decentralized digital identity. These new identifiers are designed to enable cryptographically-verifiable identification and to be implemented independently of any centralized registry, identity provider, or certificate authority.

Why?
The reason is simple: the so-called DRI and DID have a lot in common; at a high level they are the same thing. Making the DID spec generic enough is beneficial for everyone. This is what makes standards really powerful: they can unify parts without limiting their use to a specific use case. In the current DID spec, the specific use case is to use a DID only for the purpose of controlling a specific DID Document. And not even that, as we have already seen the movement toward making the "controlling" part optional, so you can have a DID Document which cannot be altered after it is created. It seems that we are just a small step away from DRI in that situation.

@talltree gave a very good example of how the story changed over time while thinking about what a DID should/could be. In my opinion, this shows how DIDs are slowly maturing and people are realizing that the problem they started with in the first place can be applied to a broader space.

As @dhh1128 already mentioned, the reason why we are even having this discussion is that DID seems to surpass other existing standards due to its specific properties:

Decentralize
Eliminate ambiguity
Provide extensibility

And the common denominator for all of the above is cryptographically-verifiable identification, as @talltree pointed out. For me personally, this gives a very solid standard which, if adopted, would give the community a lot of benefits: your wallet/agent/system/website/you name it implements one standard for cryptographically verifiable identifiers, and you can support people identity, things identity, and content identity, without saying how this identity is defined.

There was already a statement that DIDs allow you to identify anything. Yes, but through a DID Document, which in many cases is just an unnecessary step, e.g. for identification of content, and which could end up revealing too much.

mitfik on 20 Mar 2020

👍2

I also just read the thread, and even though I find the idea intriguing to broaden the concept of DIDs to all _"cryptographically-verifiable identification"_, without DID documents and without DID controllers, I still have a preference for @jandrieu's perspective, and for sticking with the current "party line".

Unfortunately, we have monopolized the term "Decentralized Identifier (DID)", when in fact there are other "decentralized identifiers" out there. But still I believe we should stick to the mental model that all DIDs are "created" and "controlled".

(Side note: in very practical terms, deviating from this may mean deviating from the DID WG charter - see the first few bullet points).

I would argue that there are existing and better "decentralized identifiers" than DIDs, which can already be used for identifying malware or elements in the periodic table. Those identifiers are URNs.

Magnet-Links have used urn:sha1:xxxx for a long time. You could create a more modern version urn:multihash:xxx, and I think you got what this thread is all about? "Cryptographically-verifiable identifiers" that are "discovered" rather than "controlled"?
You can always create new URN namespaces for other things like elements in the periodic table, e.g. urn:atom:xenon or urn:atom:54. Sorry, "URN namespace" doesn't sound as cool as "DID method" :(
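For comparison, a small sketch of how such URNs could be computed. The `urn:sha1` form follows the Base32 encoding typically seen in magnet links; `urn:multihash` is not a registered namespace as far as I know, so that function and its hex encoding are assumptions for illustration only.

```python
import base64
import hashlib


def urn_sha1(content: bytes) -> str:
    """urn:sha1 form as seen in magnet links: Base32 of the SHA-1 digest."""
    digest = hashlib.sha1(content).digest()
    return "urn:sha1:" + base64.b32encode(digest).decode("ascii")


def urn_multihash(content: bytes) -> str:
    """Hypothetical urn:multihash form: hex of a sha2-256 multihash."""
    digest = hashlib.sha256(content).digest()
    # Multihash layout: <hash function code><digest length><digest>.
    # 0x12 is the multihash code for sha2-256; 0x20 is the 32-byte length.
    return "urn:multihash:" + (bytes([0x12, 0x20]) + digest).hex()
```

Either way the identifier is computed from the content by anyone who has it, with no registration step and no controller.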

peacekeeper on 20 Mar 2020

👍4

My concern is that if we don't introduce did:o:, we may alienate certain industry sectors which would damage the global appeal of DIDs. For example, the biggest Pharma DLT consortium project that we are in talks with at the moment are discussing using a platform architecture that doesn't use DIDs at all. If we had did:o: in the spec, they would definitely adopt DIDs and, with it, we could introduce the whole DID/VC flow into their architecture. Without that DID method, they have absolutely no need to delve into the world of DIDs, which compromises our overall desire to build a truly interoperable decentralized data economy. No pressure, huh @peacekeeper

pknowl on 20 Mar 2020

@pknowl I understand that we may want to broaden the scope of DIDs as much as possible for marketing reasons, and I'm not strongly opposed to doing that. But from a purely technical perspective, why couldn't that consortium participate in a "truly interoperable decentralized data economy" with URNs, if they want identifiers that don't need controllers or DID documents?

peacekeeper on 20 Mar 2020

@peacekeeper Should URNs be used as a universal standard for object identifiers? Surely, a DID should be able to cover all 3 valid identifier states.

grid-reference

In terms of key interdependency, we're already taking a hash of content in the "dependent" state. It sounds weird to disregard content when no entity identifier is referenced. It almost feels like we're turning our close cousins away for Thanksgiving dinner!

pknowl on 21 Mar 2020

@pknowl — can you unpack your chart a little bit more for us? I.e., can you explain more fully:

What you mean by "key dependent" and "key interdependent"?
What is the difference between "trusted" and "immutable"?
What is the difference between an "entity identifier" and an "object identifier"?

I suspect this is all very clear in your head, but the chart is so terse that I suspect others will have a hard time grokking it without those explanations.

talltree on 22 Mar 2020

👍2

@talltree - No problem at all.

1.) "Key dependent" means that an identifier is governed by an entity and therefore a signing key is required. "Key interdependent" means that an object identifier can either be governed by an entity (whereby a signing key is required) or not governed (no keys required).

2.) "Trusted" means that an identifier is governed by an entity and therefore a signing key is required to establish trust. "Immutable" means that an identifier contains a hash of the content of an object which cannot be changed. If an object identifier is governed, the controller of the signing key has control over the content contained within the associated DID-document and, as such, it can no longer be deemed immutable.

3.) An "entity identifier" is an identifier that is governed by an entity who controls the signing key. An "object identifier" is an identifier that contains a hash of the content of an object.

pknowl on 22 Mar 2020

@talltree Perhaps this network model will help visually.
network.pdf

You'll also notice that I've changed any reference of DRI to DOI and any reference of did:r: to did:o: in my previous entries. My apologies for that. I only cracked the code after already having joined the thread. Anyway, all corrections made. The kernels are now solid.

pknowl on 22 Mar 2020

@pknowl Thank you very much—those terms are exactly the key I needed to unpack your table.

So one way to sum up what you are proposing is to unify the world of controlled DIDs with a new world of uncontrolled DIDs by bringing into the DID world the concept of a multihash identifier.

Now, I have another ask of you (if you are willing). One reason that I suspect for @peacekeeper 's hesitation about bringing the world of "uncontrolled DIDs" into scope is the question of resolution. Given that there is no DID document, have you thought through what a DID resolver would or should return when given a did:o: to resolve?

Could it, for example, return possible network locations of an instance of the identified object? Or any other useful information about the object?

talltree on 23 Mar 2020

@talltree There are mindset cabinets that will be unlocked here so I'll try to approach this topic from a place of practicality.

By the very nature of "key interdependency", in the case of an _object identifier_, some of the did:o: space is already being encroached upon. That is the first issue.

The second and much broader issue is the resolution process as a whole. The method name currently supports 52 different method types. That number will grow exponentially when the world of DIDs hits the masses. This issue can be resolved by moving any location information away from the method space. If that approach were adopted and did:e: and did:o: became the two permanent method names, entity identifiers and object identifiers could be treated autonomously which would resolve any fragmentation issues. _(... not to mention enable an interplanetary solution as Elon Musk endeavours to colonise Mars!)_

@mitfik and/or @ewelton can better explain what the resolution process might look like but I thought that I should bring up the elephant in the room as "if not now, when?"

pknowl on 23 Mar 2020

From my perspective DIDs were at a crossroads around Sept/Oct 2019. At that time we had an opportunity to view DIDs as a sort of universal namespace with certain properties linking the identifier to an underlying asset using clever cryptography. This is what made them different from identifiers like UUIDs.

There were two other properties that were attractive:

open resolution w/o a central authority - so DIDs could span all sorts of domains and contexts, and the only commitment made when "using a did" was the guarantee that you could retrieve some sort of information about the DID - e.g. resolve(did)
open semantics which allowed DIDs to fill a huge number of roles, one of which included the pool of verification methods and the service endpoints that define DIDs today

I believe this conversation fits that time in the history of DIDs better than the DIDs of today - which are much more focused. In recognition of that focus, I actually favor @peacekeeper and @jandrieu 's sensibilities around focusing "what a DID is" - but I still think it is worthwhile to present the motivation behind the did:e and did:o concept.

The DID method space still does contain a number of "odd ducks" - like peer DIDs (which I think are critical, but which are definitely a different breed), and did:key or did:web, which strain the idea of "what a DID is" in a different direction. So even if the ideas are late to the party, nothing is cast in stone and it can't hurt to present the ideas.

The core idea is that there are three categories of things in the digital universe:

1. immutable digital assets, which can be hashed - these are "found objects" - they can be tracked, but not controlled because there is nothing to control as they can not change.
2. active mutable digital assets representing the extension of an "actor" into digital space - this is what happens when a person or something that is "capable of agency" registers a public key somewhere, sequesters the private key, and uses that fact to bootstrap a digital presence based on that key pair
3. passive mutable digital assets which are the digital projection (or are inherently digital) of the abstractions and physical objects that we use in our daily lives - a parcel tracking engine, or a flight reservation record, or a collaborative document, and so on. These, by the nature of being passive, require (2) to act on their behalf - yet they also serve as points of coordination and correlation, which is essential to the role they play.

The distinction between 2 & 3 is captured well by the conceptual work around DIDs, with all sorts of wonderful nuances about key rotation and control and verification methods and the like being worked out. This model is still not well understood outside of the DID community, but I think that the DID community has the kinks pretty well ironed out.

DIDs bring a lot of other baggage to the table and it is not clear whether that baggage is worth the trouble, since DIDs are not required for credential & trust technologies. You do not need DIDs for credentials or capabilities, and ubiquitous, low-effort, low-barrier to entry, existing network technology can achieve the same level of cryptographic integrity. So what is the attraction of DIDs?

For a while the attraction to me was that it could create a unified namespace for the above three classes of "network things", with a "common API for resolution" at least as perceived by an application programmer. The programmer would just install the right package and call resolve(did) - and this would bypass any deep or intrinsic reliance on DNS. We finally had a simple, level playing field for asking a bootstrapping question about resources on "the network" - which included centralized, classic, decentralized, and P2P spaces.
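As a rough illustration of that "common API" idea, the sketch below shows a single resolve(did) entry point that dispatches on the method name. The registry, the `register` helper, and the stand-in `example` resolver are all hypothetical; this is not the API of any existing resolver package.

```python
from typing import Callable, Dict

Resolver = Callable[[str], dict]
_RESOLVERS: Dict[str, Resolver] = {}  # method name -> resolver function


def register(method: str, resolver: Resolver) -> None:
    """Install the resolver for one DID method."""
    _RESOLVERS[method] = resolver


def resolve(did: str) -> dict:
    """Single entry point: pick the method-specific resolver and call it."""
    parts = did.split(":", 2)
    if len(parts) != 3 or parts[0] != "did" or not parts[1] or not parts[2]:
        raise ValueError(f"malformed DID: {did!r}")
    method = parts[1]
    if method not in _RESOLVERS:
        raise LookupError(f"no resolver installed for method {method!r}")
    return _RESOLVERS[method](did)


# Illustrative only: a stand-in resolver that derives the document locally.
register("example", lambda did: {"id": did})
```

The application only ever calls resolve(did); whether a given method goes to a ledger, a web server, or a local derivation is hidden behind the registry, which is also where the cost of supporting many methods shows up.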

DIDs had the possibility to be a generic identifier that could point to anything, support a simple API for resolution (not that the implementation of that API is simple), and open the door for the global community to expand both resolution and semantics and to develop tools, facilities, and services around a "new kind of content and integrity focused address".

The current DID spec is one species, one subset of that broader vision - and I think the steps taken in the process of moving to the working group effectively crystallized DIDs as that subset. This is not any kind of indictment or complaint - it is just that the broader community is catching up to what is going on with the DID authority and so some of these broader concepts keep coming to the party.

For a while we had a shot at an internet landscape where "the essential question" asked by applications was resolve(did) - perhaps rivaling fetch(url) or lookup(host) - and this would have been a glorious development. As it stands today, DIDs are suitable for a more specific range of use cases. They will operate in parallel with non-DID components of the identity, credential, and trust technology landscape.

The issue of resolution is critical - and the fact that it is not a centerpiece of the charter, is, I think, a mistake. There are tremendous challenges towards implementing resolution for the current, restricted DID model - and those challenges are driven by the large number of methods. The large number of methods fosters "wallet siloing", where application builders choose just a subset of methods and decide if they also support other URIs - or perhaps the culture will become dominated by remote universal resolvers, which, in the years to come, might be deployed as thickly as Akamai and Cloudfront cache servers.

It is too early to tell - and I am confident the community can solve the problems. The question that plagues me is not "can we solve it" but "what is the cost/benefit ratio" - and this rests on the scope of DIDs. For a broad enough scope, a high price is attractive - but when the scope is limited, the decision is less clear.

In many ways, the focusing of DIDs is a benefit to everyone - it will help ensure the spec gets out, is cleanly done, and is well managed and under the central authorities of the W3C, W3ID, and other registry, context, and guidance providers. It will ensure that people vested in global blockchains will be well supported. It will be the anchor in the sea of chaos surrounding DIDs.

The focusing of DIDs also opens the door to a new frontier seeking identifiers that can span DIDs + centralized resources + holographic resources + immutable resources (on various storage media, like IPFS, blockchains, etc.) - and that broader space of identifier technology will draw heavily from much of the excellent work that has been done on DIDs.

It is a win-win situation, no matter how this issue plays out - the only way to lose is to keep the issue open for too long, and the essential remaining question, for me at least, is resolution.

ewelton on 23 Mar 2020

👍1

@pknowl

You'll also notice that I've changed any reference of DRI to DOI

I think you may have to change it again :) https://www.doi.org/ = "Digital Object Identifier"

peacekeeper on 24 Mar 2020

@peacekeeper More reason to pull did:o: into the DID space!

pknowl on 24 Mar 2020

Just wanted to hop in here and offer a point of view. While I believe that these CIDs are useful, I find that it's possible to represent the same data while keeping the mental model of a controller existing and abandoning the did document.

For example, let's say I had did:immutable:b85dca566725ca2d1baee467a13561af1346953a7bf281b1e259b172f5c740ab

(that's the sha256 hash of "I know this content")

and it's published to a registry with the following did document:

{
 "@id": "did:immutable:b85dca566725ca2d1baee467a13561af1346953a7bf281b1e259b172f5c740ab"
}

then what I'm asserting as the publisher (and thereby controller) is that I know of some content capable of producing that hash, and I'm registering it for the world to know about. Furthermore, I'm saving myself the step of doing this and then revoking a key.

Put another way, I could achieve the exact same outcome by doing this with did:sov (with some crafty extensions)

did:sov:123456789abcdefghi =>

{
 "@id": "did:sov:123456789abcdefghi",
 "publicKey": [
  {
   "id": "did:example:123456789abcdefghi#keys-1",
   "type": "Ed25519VerificationKey2018",
   "controller": "did:example:pqrstuvwxyz0987654321",
   "publicKeyBase58": "H3C2AVvLMv6gmMNam3uVAjZpfkcJCwDwnZn6z3wXmqPV"
  }
 ],
 "knownContent": ["b85dca566725ca2d1baee467a13561af1346953a7bf281b1e259b172f5c740ab"]
}

and then revoking the key such that the did document looks like this:

{
 "@id": "did:sov:123456789abcdefghi",
 "knownContent": ["b85dca566725ca2d1baee467a13561af1346953a7bf281b1e259b172f5c740ab"]
}

What I understand this proposal is suggesting though is that I should be able to develop a did method to make the identifier the content and call it a day. If you really wanted to, I could resolve it and get a did document as well, but it's pretty useless. _Albeit it's still a compliant did document._

In essence, what I'm suggesting is that this identifier is a cryptographic assertion by the controller that they know the content that produces the hash. It still has a controller, who abandons control immediately, and the cryptographic guarantees are asserted by the controller on initial generation of the did/did document.

@jandrieu what I understand you're suggesting is that the mental model should work in a way that's non-compliant with the normative statements currently in the text. Specifically, the fact that every property of a DID Document other than @id MAY be used. Not MUST/SHOULD. If this is really the mental model we're going to roll with, we should be changing a lot of the properties from MAYs to SHOULDs/MUSTs. Of particular note too, if we move to a SHOULD, I don't think that eliminates this use case because it has "valid reasons in particular circumstances to ignore a particular item".

Even for fun, in the case of identifying when malware was first spotted in the wild, I could use the created property to have some metadata about the subject (specifically the first time it was spotted and registered).

e.g.

{
 "@id": "did:immutable:b85dca566725ca2d1baee467a13561af1346953a7bf281b1e259b172f5c740ab",
 "created": "2002-10-10T17:00:00Z"
}

Again, still a valid did document that complies with the mental model of a controller abandoning the did document.

kdenhartog on 24 Mar 2020

🎉1

Thanks, @kdenhartog - I appreciate you walking through that proposed solution for us.

@peacekeeper @jandrieu - As to not disrupt current resolution processes, perhaps the DIDWG will concede to allow did:immutable: to become a new method type for immutable content with DID-document abandoned. Your thoughts?

Tagging a few people who may have opinions about this suggested way forward:
@dlongley @msporny @burnburn @brentzundel @talltree @dhh1128 @ewelton @mitfik

pknowl on 24 Mar 2020

@kdenhartog I really like your idea above - to me, the genius of having the @context flexibility was that it moved the MUST/SHOULD model around verification methods (and service_endpoints) outside of the core spec (making the core spec extremely light and focused).

The ability of the definition of the data representation of the cryptographic material required for credential and capability processing to exist outside of the DID-spec would create a soft-coupling between DIDs and non-DID URI ids. With a common spec, external to the DID-core, that defined fields like capabilityDelegation and keyAgreement you have broad interoperability, not just internal to the DID universe but also with the world of credentials and capabilities in the large scale efforts actively being built outside of this community.

When viewed in that light, having the DID-spec be very crisp and lightweight, and simply identifying the relationship with some key-specs using an @context (or @context-like) switch - the door is open to easily defining semantics for 'found objects' that clearly express the appropriate model. You could have a 'discoverer' field for example.

You would not want to support 'mutable' data for such an object - definitely not service endpoints, nor cryptographic control material. But it still buys you a lot - having the ability to have a consistent, non-controller scoped name for an immutable digital asset, like a virus footprint would allow malware experts (organizations and individuals) to reliably associate information about how to fix, other ways to detect, and any other manner of derived data about the malware.

Issuers could link information directly about "malware footprint X" and not "controller Y's name for malware footprint X"- currently they could do that with urn:multihash:footprint but that just begs the question of resolution and metadata - where do you get that little hint that "this is a virus footprint, discovered by Y on DATE"? It is possible, but it strengthens the growing sense that DIDs are a fringe element in the world of credentials and capabilities.

In the world where DIDs remain central to credentials and capabilities, the ability to register "found objects" like virus footprints would open the world to subscribing - not to virus detection updates from specific companies based on SSL certificates at the end of https://malwarehelp.co/updates, but to a broader field of expertise (and perhaps more rapid response). You could choose to trust specific sources, and you could accept updates to your anti-malware software based on trust in issuers rather than trust in the SSL certificate and subscriptions.

The key is in the ability to issue a VC about "malware footprint X" instead of "controller Y's name for malware footprint X" - it takes the unwanted controller out of the loop, and does not introduce the need for a centralized malware database at the end of some DNS-mediated URL. This is a step towards Doc Searls' intention economy, and away from the customer-capture-and-control model that dominates the malware industry. It also helps reduce the inherent surveillance in pinging the update server.

The only thing that bothers me about the approach you describe above is the "surrender of control" being in good faith - unless I read that wrong or am lagging in my tracking the current edge of DID thinking. It is an area where KERI, and the transparency that comes with witness records, could record the completion of "surrender of control" - so you could work around the limitations of the DID spec and identify "immutable objects" - being those for which one could document the surrender of control. It would make the abstract data model and fixed semantics of the current DID-core more complicated, but that's what standard processing libraries are for.

One final thought about the benefits of prying the cryptographic material and service endpoints out of the DID-core spec, and into associated specs is that it would help address the tension of having resolution out of scope while the service_endpoints definition and URL discussions are in-scope.

With that lightening of the DID-core burden you get a very nice compartmentalization of concerns, and some excellent linkage with the broader world of identity, credential, and trust-technologies - the ones that will be in play when DIDs mature to the point of broad adoption.

But again, there is tremendous progress being made in the current DID-spec model - and we must take care not to derail that, even if it means that DIDs become a niche solution in the broader universe of URI mediated credential and capability processing.

ewelton on 24 Mar 2020

👍1

Wow, I must admit I am coming to this even later than @talltree and am astounded at the length and depth as well.

I'd like to make some comments at a meta level.

One of the prime killers of standards is scope creep. As one of the chairs, my job is to avoid that where possible. The challenge is that as others become aware of and join the work, it is natural for them to see ways in which they could make "just a small change" that would enable other use cases.

@jandrieu and @peacekeeper are correct in their description of the original intent of DIDs, the intent that spawned the incubation work and then the formal standardization track we are on. (And sorry, @talltree, but the notion of a DID document as something presented/generated rather than (necessarily) stored has existed since BTCR, the very first DID method)

@ewelton is correct that the latest reasonable time for a major perspective shift was when this group kicked off last September. Actually, even that was too late - charter development was the time.

However, I like what I am now seeing - an attempt to describe how to accomplish at least some of the "new" use cases presented in this thread via mechanisms already existing in and envisioned by the spec. This is how standards succeed, by accepting a functional if non-ideal use of the original idea. Remember, we can always do a 2.0 or create a completely different standard after learning through use of this one. In the meantime, an imperfect or limited standard that is COMPLETED is vastly superior to one that undergoes a revamp every year when new perspectives are added.

Soooo, I am not expressing an opinion, at all, on the value of the alternate perspective from @dhh1128 and @pknowl or the specific proposal from @kdenhartog . I am merely suggesting that creative directions like that suggested by @kdenhartog are typically more successful ways to get a new, potentially larger raft of use cases addressed because they dramatically limit scope creep and thus may allow us to finish this within a reasonable time span.

So keep talking!

burnburn on 24 Mar 2020

🚀4 👍1

Thanks, @burnburn . Your post adds a wonderful sense of calm to this thread and gives us the necessary movement and oxygen to continue the discussion. At this stage, the argument is more philosophical than technological.

I strongly believe that a decentralized data network will be much more stable if we only have one identifier throughout the space - _DIDs for everything_. That would allow us to put all other identifier standards out to pasture with a graceful "thank you for getting us to this point."

I'm attaching a mini-deck so that everyone can visualise exactly what that means. _DIDs for everything in the big blue circle_. See first slide.
Identifiers.pdf

pknowl on 25 Mar 2020

@kdenhartog Hats off to you for a wonderfully simple twist to thinking about what @dhh1128 and others on this thread have been proposing. The way you describe it, did:immutable: actually strikes me to be much more like did:key, i.e., the contents of the DID document are cryptographically related to the DID itself.

I like it. In fact I like it so much—and it fits within the existing spec so easily—that I propose that one of the proponents write a PR.

talltree on 25 Mar 2020

👍1

@dhh1128 Thanks for instigating a vibrant and much needed discussion.

@kdenhartog Thanks for coming up with a simple solution for us to get our teeth into.

@mitfik and I will start writing the did:immutable: method. Kyle has kindly offered to assist but would rather not be the sole maintainer (which makes total sense as this method really falls on the _Decentralized Semantics_ side of the model). He has suggested that did:key: would be a great basic template to work from and that it might be worth getting the Protocol Labs folks (creators of _multihash_) to join in too.

We'll get something down in writing asap for review.

Thanks, everyone.

pknowl on 25 Mar 2020

We already have:

Magnet Links (Example: urn:sha1:xxxx)
IPFS addresses (Example: ipfs://xxxx)
Hash URIs (Example: hash://sha256/xxxx)
RFC6920 - Naming Things with Hashes (Example: ni:///sha-256;xxxx)

The last item in that list says in the Abstract: _This document defines a set of ways to identify a thing (a digital object in this case) using the output from a hash function._
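To make the comparison concrete, here is a small sketch that renders one piece of content in three of the forms listed above. The encodings reflect my understanding of each scheme (hex for hash URIs, unpadded base64url for RFC 6920, base32 SHA-1 for the magnet-style URN); the function names are illustrative, and IPFS addresses are left out because they need multihash/CID encoding beyond the standard library.

```python
import base64
import hashlib


def hash_uri(content: bytes) -> str:
    """Hash URI form: hex-encoded SHA-256 digest."""
    return "hash://sha256/" + hashlib.sha256(content).hexdigest()


def ni_uri(content: bytes) -> str:
    """RFC 6920 form: base64url SHA-256 digest with the padding removed."""
    digest = hashlib.sha256(content).digest()
    encoded = base64.urlsafe_b64encode(digest).rstrip(b"=").decode("ascii")
    return "ni:///sha-256;" + encoded


def sha1_urn(content: bytes) -> str:
    """Magnet-style form: base32-encoded SHA-1 digest."""
    digest = hashlib.sha1(content).digest()
    return "urn:sha1:" + base64.b32encode(digest).decode("ascii")


if __name__ == "__main__":
    sample = b"example content"  # placeholder bytes
    for make in (hash_uri, ni_uri, sha1_urn):
        print(make(sample))
```

All three are plain URIs derived from the bytes alone, with no controller and no document to resolve.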

I agree that it would be possible to come up with DID methods that define the CREATE and READ operations in very creative ways, and by all means, go ahead. I'm just not sure what the use of DIDs really adds, especially if you don't need controllers and DID documents. You can achieve everything with plain URNs or other identifiers that are much simpler than DIDs, while still being interoperable (since everything is a URI).

BTW as a side note, if I understood @dhh1128 's original argument correctly, I think this thread is not only about DIDs that are cryptographically derived from the subject, but also - more broadly - about DIDs that are objectively observable from the subject in other ways, e.g. atomic numbers of elements in the periodic table. I look forward to reading the DID method specifications and how they define the CREATE operations. How are the DIDs for elements in the periodic table created? Should we make CREATE optional and argue that some DIDs may just "have always existed and are discovered rather than created"? Or did God invoke the CREATE operation for those DIDs? How "decentralized" is that? :)

The more I think about this, the less concerned I am, so please feel free to go ahead!

Just one request regarding did:immutable: I think the method name is not ideal, since it may be misunderstood to mean that these DIDs always refer to the same subject and cannot be reassigned to other subjects. But this is actually a property of all DIDs.

peacekeeper on 25 Mar 2020

👍2

@peacekeeper I do not think that "elements" work, as they are not immutable digital assets, but rather are "objects in the physical world". I believe that when mapping those objects it makes sense to always have a point of reference - e.g. a custodial owner.

In other words NIST and CERN might define periodic tables of elements - but there is no way for them to agree upon using an inviolate identifier, although they can provide some "claim" that says "NIST claims that what NIST calls Vibranium is the same thing as what CERN calls Vibranium" - as in

did:<nist> asserts that did:<vibranium-1>, controlled by did:<nist>, is 'the same thing as' did:<vibranium-2>, controlled by did:<cern>

this is a claim that can be refuted when it turns out that their values for the Atomic Weight differ by 0.001 units, and therefore may actually be two independent things, or it can be held "as equivalent" in so far as it is useful.

The downside is that there is no way to talk about "Vibranium" without positing a semantic frame of reference. It may seem that the word works, but "Vibranium" is not deterministically resolvable into anything. I think this is a fundamental property of linking the digital and physical world. Truly immutable objects only exist in the digital world - and when they cease to exist they do so wholesale, not through decay and mutation, which changes the hash.

So I believe that there is a deep requirement that the numeric component is derivable from the content, and as you indicate, there are several options available when dealing with these creatures. There is no compelling reason to push them into DIDs-of-today. However, a while back there was a conflict-free pathway to including this sort of uniquely digital object under the umbrella of did resolution. This was something that is not possible with URNs, but is possible with some other frameworks on the list. Thus, if this were a year earlier, we could reasonably have talked about including a common infrastructure for resolution which would unify the multiple strategies you list, such that resolve(did) was all you ever did - hiding the complexity of 'method of access' along the same lines as we do with did methods today.

However - I absolutely agree that the window for capturing these items in this version of the W3C DID spec has passed - except through some strategy like @kdenhartog suggested, which is interesting in so far as it plays within the rules defined by the central DID authority. ;)

When it comes to a name - I definitely have no strong love of the term immutable - but I am not sure what the correct name is because there is a risk of conveying the idea that they could refer to physical objects - like elements in the periodic table, or abstractions like center of mass, or even numbers - and that would be incorrect. Only objects for which a hash (e.g. urn:multihash:) could be calculated would qualify - because there is a requirement that the DID and the target are deterministically and verifiably linked.

It is this property which can not hold when talking about the physical world, and requires the cryptographic binding and proof of control that DIDs establish.
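The "deterministically and verifiably linked" requirement can be sketched as a check that anyone can run given the identifier and a candidate object. This assumes the hypothetical `did:immutable:` form with a hex SHA-256 suffix used earlier in the thread; the function name is illustrative only.

```python
import hashlib

# Hypothetical method prefix; the thread has not settled on a name.
PREFIX = "did:immutable:"


def refers_to(did: str, content: bytes) -> bool:
    """Return True only if the identifier is the hash of exactly these bytes."""
    if not did.startswith(PREFIX):
        # Identifiers that are not content-derived (e.g. did:atom:carbon)
        # cannot be checked this way at all.
        return False
    claimed = did[len(PREFIX):]
    return claimed == hashlib.sha256(content).hexdigest()
```

A physical object has no byte string to feed into such a check, which is the gap described above: the link can only be asserted, never recomputed.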

ewelton on 25 Mar 2020

👍1

@peacekeeper @ewelton This method is specific to a _non-governed immutable object_. Does that trigger a method name that you would be happy with? Open to suggestions but we're confined to those three words.

pknowl on 25 Mar 2020

Let me start off by saying that I don’t think the purpose of DIDs is to encroach on every other identifier. I think there are clearly understood concepts that we have no normative statements to enforce, and that’s what my proposal intended to show, but I didn’t do a great job conveying it.

For example, I believe that did:atom:carbon is valid today as I read the spec even though I don’t think it should be. The rules around method-specific identifier generation look to set some mental model assumptions, but don’t go so far as to explicitly say this is not possible. I believe this should be changed. I think if we’re changing normative statements for this method, it should be to further constrain what method-specific identifiers can be right now so that human-friendly identifiers are less likely to occur. Not trying to create an all-encompassing identifier because as @burnburn pointed out this is going to make it much harder to get the spec across the line. How we make that testable is going to be the hard part and that’s what I think will take philosophical questions of the abstract into concrete text that strengthens the purpose of dids.

The reason I believe we should be doing this is because I think this will be what brings teeth to the claim “decentralized”.

Additionally I think the purpose of this did method should be used to further explore and show what’s the difference between DIDs and URNs. One of the main aspects in my mind that differentiates the two would be resolvability of metadata.

Where the metadata discussion goes and the normative statements that come out of it could be another place we limit this method. If our metadata approach ends up very extensible, then that will enable more functionality in this did method. On the other hand, if we decide to heavily restrict metadata in the did document as @jandrieu has suggested, I think what I originally described will be the extent of what this did method can cover.

So to make my position clear on this, I find the idea of cryptographically bound content identifiers fitting within this spec either way. I think it fits clearly in our mental model as it stands today. What I disagree with is that we should be creating an all encompassing identifier that can do anything and everything, but as it stands today the spec allows me to do that. If others disagree with that as well we need to start adding normative statements to enforce this now. The best way to do that will be to push the boundaries with conversations like this AND with concrete proposals for changes to the spec. Otherwise it’s inevitable for someone to come along in five years and define did:atom because of some extraneous reason they decided dids were the best way to do this.

kdenhartog on 25 Mar 2020

👍2

The maximum number of letters in any other DID method on the registry is 7. On that basis, I propose that we shave off a couple of letters. I think did:immutab: works perfectly well both in meaning and visually.

pknowl on 26 Mar 2020

The maximum number of letters in any other DID method on the registry is 7. On that basis, I propose that we shave off a couple of letters. I think did:immutab works perfectly well both in meaning and visually.

I'd prefer something more along the lines of did:cid or did:hash personally. I think it doesn't cause the confusion that Markus highlighted above around the subject being mutable. However, I don't think this would support update functionality because the controller abandons control at the time of publishing if we went in the direction of my proposal.

kdenhartog on 26 Mar 2020

I like did:hash:. I think we should avoid any mention of id in the method space as that is already explicit in did:.

pknowl on 26 Mar 2020

👎1

I really like did:hash:

Please, no... imprecise and do you see what Github turns it into, above. :)

Name it something more precise... like did:cid (content identifier) as @kdenhartog mentioned above, or did:chash (content hash)

msporny on 26 Mar 2020

👍1

did:chash: (content) or did:ohash: (object) ?

Going back to the original definition ...
An "object identifier" is an identifier that contains a hash of the content of an object.

In that definition, "hash of the content" is a function and the "object" is a target.

Does the DID method type usually depict a function or a target? That could help steer the final naming decision.

@ewelton may have an opinion here also. He is an expert in physical / digital convergence.

grid-reference

pknowl on 26 Mar 2020

@pknowl Actually - I'm not so sure we need to head out and name this method quite yet. This has been an interesting exercise, but I am not quite sure that a method is ready for prime time.

I still am unclear about two things:

how is control surrender enforced
how does resolution work (e.g. what is the relationship w/ an underlying registry)

I would like to see some clear use cases of these as dids, instead of using the alternatives such as those mentioned by @peacekeeper, which are adequate to the task at hand.

I also want to be clear that the intention was never to simply replace existing identifier schemes for the sake of replacement. The appeal of content-bound identifiers made sense in pre-WG DID days as part of a larger effort to support verifiable identifiers and

avoid tech-stack siloing at the application programming layer (such as "this is an IPFS project")
remove the hidden dependence on DNS for resource resolution

If there is some clarity on how this method would work in terms of resolution, and how we could guarantee that the registered object fingerprint is no longer under the control of anyone (which may not be possible using DIDs) then a method is warranted and could be valuable. Once those are answered, a name might suggest itself, and without answers to those questions I would suggest not making a DID method.

The outcome we want to avoid is adding a method to the pile that looks like it does one thing, but in fact does another. If all we do is create a "regular DID" with an additional field, that is insufficient. The method space is too large relative to what DIDs are today, and we should resist adding to that pile without a strong value proposition.

ewelton on 26 Mar 2020

Thanks, @ewelton . That is a sound argument but, going back to my original argument of _DIDs for everything_ in a decentralised network which allows us to move into a synergistic future with better naming conventions and smarter identifiers, I'm keen to keep investigating.

@kdenhartog - Are you able to answer Eric's first question ...

1.) How is control surrender enforced?

@mitfik - Are you able to answer Eric's second question ...

2.) How does resolution work (e.g. what is the relationship w/ an underlying registry)?

Let's hammer that out before coming back to method naming.

Just so I don't have to scroll back later on, can someone also give me a definitive answer on whether a DID method type should depict a function or a target? Thanks.

pknowl on 26 Mar 2020

We've said a lot of words here. I have tried to keep this brief (and failed). HOWEVER, I am responding with a different illustration of what I see as the defining mismatch between content-based identifiers and DIDs.

This thread has shifted my sense of how we communicate what a DID is. Regardless of whether we adopt this new kind of DID as something we, as a standards effort, want to incorporate, we should definitely update the language in the spec so the mismatch can be minimized for future readers. People have a hard time understanding how DIDs do what they do, which is vital to understand if they are appropriate for a given reader's needs. However the technical questions resolve, we definitely have a documentation problem.

Here's what clicked for me as I was trying to understand how we are talking past each other.

DIDs are a framework for cryptographically proving control over an identifier without relying on a trusted third party.

This is what's new. This is what's different.

This proposal to "nuance" our mental model abandons that and would create a new class of DID which is essentially uninteroperable with other DIDs. I'll call these CIDs for content identifiers, which have all the characteristics described by others. As I've stated many times, they sound awesome. They will be useful. It makes sense to standardize a way to use them.

Consider the use cases document:
https://w3c.github.io/did-use-cases/

First, two of the first four essential characteristics of DIDs are not met by CIDs:

3. cryptographically verifiable: it should be possible to prove control of the identifier cryptographically;

4. resolvable: it should be possible to discover metadata about the identifier.

3 is not met because the hash provides NO way to demonstrate control. It only demonstrates knowledge of the associated content.

4 is not met because there is no derivable meta-data about the identifier. A CID has no mechanism to lead you to additional details that would allow the core functionality that define DIDs. In particular, there is no way to bootstrap a control framework just from a hash.

Maybe I'm missing something on #4, but to my understanding, revealed knowledge cannot establish control in the way that secret knowledge can. If you must reveal the knowledge to satisfy the cryptography, as you do with hashes, you cannot prove anything cryptographically without ceding equivalent control to the recipient of the proof. It leaks control and therefore isn't suitable as a control framework.
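A small sketch of that leak, assuming the "proof" is a plain hash check in which the prover reveals the content. The names here are illustrative and not part of any spec.

```python
import hashlib


def cid_for(content: bytes) -> str:
    """Content identifier: hex SHA-256 of the bytes."""
    return hashlib.sha256(content).hexdigest()


def proves_knowledge(cid: str, revealed: bytes) -> bool:
    """The only check a bare hash allows: does the revealed content match?"""
    return cid_for(revealed) == cid


content = b"I know this content"
cid = cid_for(content)

# Alice can only convince the verifier by handing over the content itself.
alice_proof = content
assert proves_knowledge(cid, alice_proof)

# The verifier now holds the same bytes and can pass the same check with
# anyone else, so the check never distinguishes Alice from later presenters.
replayed_by_verifier = alice_proof
assert proves_knowledge(cid, replayed_by_verifier)
```

Nothing in the check binds the proof to a particular party, which is the difference from a signature made with a secret that is never revealed.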

Second, of the 13 actions enabled by DIDs, only the first two are supported by CIDs:

3.1 Create
3.2 Present
3.3 Authenticate
3.4 Sign
3.5 Resolve
3.6 Dereference
3.7 Verify Signature
3.8 Rotate
3.9 Modify Service Endpoint
3.10 Forward / Migrate
3.11 Recover
3.12 Audit
3.13 Deactivate

CIDs can't be used to Authenticate, Sign, Resolve, Dereference, Verify Signature, Rotate, Modify Service Endpoint, Forward / Migrate, Recover, Audit, or Deactivate.

Third, the reason DIDs are useful in decentralized identity is precisely because of the ability to demonstrate control. Not because they identify only a particular class of thing or because they can disambiguate anything.

(FWIW, even @dhh's second definition of disambiguate wrt Alice's definition is unknowable and unprovable. Because people other than Alice can use the DID as a subject without getting confirmation from Alice that they are using it in the way that she means it. And even if they did, there is still the risk of semantic drift as Alice's sense of what she means evolves over time.)

The way DIDs bootstrap digital identity, in the most typical use case where Subject==Holder==Controller (whether or not the issuer is identified by DID) is as follows:

Two stages.

First, you get the credential.
Stage I

1. You onboard at an issuer--they first prove who you are to their satisfaction.
2. You prove control over a given DID (often called DID-AUTH) using the secret associated with the cryptographic material specified in the DID Document
3. The issuer generates a VC with that DID as the subject and gives it to you, signing it in a provable manner.

Second, you use the credential.
Stage II

1. A Verifier presents a challenge in a request for a credential
2. You construct a Verifiable Presentation which includes both the challenge and the VC, signed by the same secret material used to prove control in Stage I.2
3. The Verifier checks that the holder and subject are identified by the same DID.
4. The Verifier checks that the presentation (with the challenge) is signed with secret material indicated by implication in the DID document. Most commonly, the VP is proven to be signed by a private key that matches a public key in the DID Document.
5. The Verifier checks that the signature of the credential matches the known cryptographic material from the issuer (this can be from a DID Document or from any other pre-arranged mechanism to exchange keys or the like).

At this point, the Verifier knows that the current presenter of the VC has proven control over the same secret information as the subject, and therefore, with a specific level of assurance they can accept that the current presenter is one of the following:

1. the Subject
2. a delegate of the subject with cryptographic authorization (someone who has control over a proof mechanism listed in the authentication section of the DID Document or who simply has been given the private keys of the Subject for this purpose)
3. a bad actor who has compromised the keys (or proof mechanism) of the Subject

We always have to allow for #3. That's the weakness in the system. However, the entirety of modern cryptography has this weakness, which is why keys MUST be kept secret if they are to have any use whatsoever.

It is the ability to perform this proof of control that ties the issuance of a VC to its presentation so that a Verifier can have some proof that the party presenting the credential is, in fact, the entity given that credential, which to the best knowledge of the issuer was believed to be the subject of that credential.
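The challenge/response core of Stage II can be sketched as follows. This is only an illustration under loud assumptions: a real deployment would typically use something like Ed25519 keys listed in the DID Document, whereas this sketch substitutes a Lamport one-time signature purely so that it runs with the standard library. The credential itself is omitted, and `did:example:holder` and all function names are hypothetical.

```python
import hashlib
import secrets


def _h(data: bytes) -> bytes:
    return hashlib.sha256(data).digest()


def _bits(message: bytes) -> list:
    """The 256 bits of SHA-256(message), most significant bit first."""
    digest = _h(message)
    return [(digest[i // 8] >> (7 - i % 8)) & 1 for i in range(256)]


def keygen():
    """Lamport key pair: 256 pairs of random secrets and their hashes."""
    secret = [(secrets.token_bytes(32), secrets.token_bytes(32)) for _ in range(256)]
    public = [(_h(a), _h(b)) for a, b in secret]
    return secret, public


def sign(secret, message: bytes) -> list:
    """Reveal one secret per message bit. A Lamport key must sign only once."""
    return [secret[i][bit] for i, bit in enumerate(_bits(message))]


def verify(public, message: bytes, signature) -> bool:
    if len(signature) != 256:
        return False
    return all(_h(part) == public[i][bit]
               for i, (part, bit) in enumerate(zip(signature, _bits(message))))


def controls(did_document: dict, challenge: bytes, proof) -> bool:
    """Verifier side: check the proof against material in the DID Document."""
    return verify(did_document["publicKey"], challenge, proof)


# Holder: generate keys and list the public half in a (hypothetical) document.
holder_secret, holder_public = keygen()
did_document = {"id": "did:example:holder", "publicKey": holder_public}

# Verifier: issue a fresh challenge so that old proofs cannot be replayed.
challenge = secrets.token_bytes(16)

# Holder: answer the challenge using the secret that never leaves the holder.
proof = sign(holder_secret, challenge)
```

The verifier learns only that the presenter holds the secret matching the document, which is exactly the property a bare content hash cannot provide.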

You could, of course, use a third party to demonstrate proof of control. You just ask Facebook who they believe is the current presenter. They'll use their own authentication approach then present their result. The whole point of DIDs is to enable this sort of bootstrapping of verifiability WITHOUT relying on the likes of Facebook. That's what makes DIDs unique and valuable.

CIDs can't be used in this fashion. As such, they just don't do--CAN'T DO--the fundamental thing that DIDs were created to do.

Yes, we can attempt to interpret the "decentralized" part of the DID name in the hope of supporting all the kinds of identifiers that can be rigorously created without a trusted third party, but, when we can't even agree on the meaning of the word "decentralized", that seems like a particular kind of madness. No offense to @dhh1128 @pknowl @ewelton or any other proponents of this idea. It's just that shoehorning an incompatible, non-interoperable notion of DIDs because of lexical similarity with an ill-defined term just doesn't stack up for me.

That said, I do like CIDs. They have been implemented as URNs in several forms from urn:hash to urn:sha. The particular variation proposed here might deserve its own namespace, such as urn:cid or perhaps if it builds on multihash, urn:multihash.

However, since

you can't use CIDs to perform proof of control to bootstrap decentralized identity in the way described above
CIDs lack 2 out of 4 "essential" characteristics of a DID
you can't use CIDs to perform 11 of 13 actions of DIDs as captured in the Use Cases and Requirements document

I can't help but come to the conclusion that CIDs are not DIDs.

If it doesn't look like a duck and doesn't quack like a duck, it's probably not a duck.

It might be a bird. It might taste delightful when prepared in the Peking style, but it still probably isn't a duck.

jandrieu on 26 Mar 2020

The did:o: identifiers would not sit in any identity registries.

There is some precedent for this. The DNS RFCs specifically exclude the .onion root domain (and a few others) from fully complying with the DNS standard. See specifically https://tools.ietf.org/html/rfc7686

-- Christopher Allen

ChristopherA on 26 Mar 2020

I'm sorry, is the proposal here to have a did:o namespace that then has multiple methods underneath it?

For example

did:o:sha:123...
did:o:multihash:abc...
did:o:myHash:xyz

Is that what you're suggesting @pknowl?
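
To make the question concrete, here is a minimal sketch of how such identifiers would split under the generic `did:<method>:<method-specific-id>` shape. The helper name `parse_did` is made up for illustration, and the inputs are just the placeholder strings from the examples above, not real identifiers.

```python
from typing import Tuple


def parse_did(did: str) -> Tuple[str, str]:
    """Split a DID into (method name, method-specific identifier).

    Only the first two colons are structural; any further colons stay
    inside the method-specific identifier.
    """
    parts = did.split(":", 2)
    if len(parts) != 3 or parts[0] != "did" or not parts[1] or not parts[2]:
        raise ValueError(f"not a DID: {did!r}")
    return parts[1], parts[2]


# Placeholder identifiers taken from the examples in this comment.
for example in ("did:o:sha:123", "did:o:multihash:abc", "did:o:myHash:xyz"):
    method, specific_id = parse_did(example)
    print(f"method={method!r} method-specific-id={specific_id!r}")
```

Read this way, all three share the single method name `o`, and the `sha` / `multihash` / `myHash` part is only a convention inside the method-specific identifier, which the `did:o` method spec itself would have to define.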

jandrieu on 26 Mar 2020

@jandrieu I want to clarify - I am not a proponent of adding content based identifiers into the current model of DIDs. This is because of the two reasons I enumerated - lack of solution to resolution, and no way to fully "surrender control" - and "reproducing" simple urns but calling them DIDs is silly - and besides did:o:sha:123 doesn't assist resolution at all, because it is missing location information.

One of the mistakes made in the DID model is the strange handling of resolution - DIDs contain some location information but rely on a bunch of secret hidden magic to make them resolvable. Resolution is critical, and leaving it out of scope is just part of what I consider "a long series of mistakes" beginning around mid 2019.

Current DIDs have become defined the way you define them as the result of evolution of the community. DIDs were more open to flexibility and interpretation in the past. Alternative approaches to DIDs lost out in the sea of privacy, control, and decentralization voices - and that is fine. The rubric idea became myopically focused on decentralization, so we lost most of the structure for navigating the alternatives. The use cases became focused on what I consider a niche world. The collapse of semantic flexibility meant we got onto the road of "the one true DID"

So, to be clear - I believe that there are legitimate use cases for these sorts of "non-controlled" and "verifiable" content-based identifiers. And I believe that 1 year ago would have been a great time to sweep them into DIDland so that we could build them into the resolution infrastructure. And I believe that the flexible semantics we had 1 year ago gave a very clean path to model this larger landscape inclusively and to the benefit of the global community.

However, as of today, DIDs are more focused - they are a much more specific thing, and that means that a spec will be produced and we'll get some nifty tools out. It also means that I think that getting these sorts of capabilities into the DID landscape, for the goals @pknowl identifies, might not be viable today - the window has closed and it is time to work with the DIDs we have, not the DIDs we want. Maybe there is a way to shoehorn them into the authoritative model of DIDness, but it will take a cleverer person than me to do it.

Don't get me wrong - there has been a lot of great work and thought behind DIDs-of-today - but DIDs are neither revealed truth nor natural law, they are the result of a negotiated specification that reflects the loudest and most energetic voices. Since those have focused on privacy paternalism, control, anti-correlation, and a particular interpretation of decentralization - that is what we have. I am excited to see a lot of the work that is going on, but these DIDs are just not that relevant to my use cases - there are alternatives which I can use today to deliver "improved-sovereignty" and "improved government and business processes" through the use of non-DID grounded credentials and capabilities. When DIDs are mature and in broad adoption, it will be easy to incorporate them into my world and further improve sovereignty - and I am looking forward to that.

What makes DIDs strong for some people makes them weak for others - and that is normal. What is most important is that the spec stabilizes and is released. There is always room for adaptation in the next round of specs, and via alternative specs - so I support this effort to the extent that it does not derail or retard the delivery of a clear specification - whatever it winds up saying.

ewelton on 26 Mar 2020

Many thanks for pointing me to that link, @ChristopherA . Very much appreciated.

@jandrieu - For our purposes, we're not interested in location, we just need to know that the content is immutable. Perhaps resolution characteristics and MIME-type would be held in the associated DID document. I would expect the did:o: namespace to be very simple ...

did:o:<hashofcontent>

For example, if a non-governed object were moved from Drive A to Drive B, the identifier should remain the same even though the location has changed.
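
A minimal sketch of that property, under assumptions the thread has not settled: SHA-256 as the hash and a plain hex digest as the encoding (a multihash encoding was also floated). The function name `content_identifier` and the directory names are hypothetical.

```python
import hashlib
import tempfile
from pathlib import Path


def content_identifier(path: Path) -> str:
    """Derive an identifier from the file's bytes only; the path is never hashed."""
    digest = hashlib.sha256(path.read_bytes()).hexdigest()  # assumed hash and encoding
    return f"did:o:{digest}"


with tempfile.TemporaryDirectory() as tmp:
    drive_a = Path(tmp) / "drive_a"
    drive_b = Path(tmp) / "drive_b"
    drive_a.mkdir()
    drive_b.mkdir()

    original = drive_a / "object.bin"
    original.write_bytes(b"non-governed object")
    id_before = content_identifier(original)

    # Move the object from "Drive A" to "Drive B" and derive the identifier again.
    moved = drive_b / "object.bin"
    original.rename(moved)
    id_after = content_identifier(moved)

    print(id_before == id_after)
```

Because only the bytes feed the hash, anyone holding the same content derives the same identifier, wherever the copy lives.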

@mitfik will certainly have some deeper insight into requirements and resolution.

pknowl on 26 Mar 2020

@ewelton - I'm also acutely aware that if we get the naming convention right at this stage for non-governed objects, the Semantics side of the model would remain stable despite the release of future versions of the DID specification. This is just as much about sustainability to the network going forward as it is to non-governed objects requiring a stable identifier under the DID umbrella.

pknowl on 26 Mar 2020

Actually, the precedent of allowing for some “special purpose domains” that do not need to fully adhere to the DNS RFCs is described more fully in Section 3 of RFC 6761.

https://tools.ietf.org/html/rfc6761#section-3

The .onion domain RFC https://tools.ietf.org/html/rfc7686 describes more why this top level domain meets the criteria.

I’d like to suggest that we support a similar carve out (like in RFC 6761) for how to register a “special purpose method”, but specifically do not add to our agenda to tackle specifying the nature of any such method.

This allows the did:o, etc. people to proceed with their ideas, and allows others who do not meet the full criteria of the 1.0 standard to still be able to experiment.

We could begin by registering those methods that don’t support full CRUD, marking them as “special purpose methods” in the registry; each such method only has to show why it qualifies.
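
As a rough illustration only, such a registry rule could be sketched like this. Everything here is hypothetical: the `MethodRegistration` type, its fields, and the `validate` check are not part of any existing registry format, and "full CRUD" is taken to mean create, read, update and deactivate.

```python
from dataclasses import dataclass
from typing import FrozenSet

# Assumed meaning of "full CRUD" for the purpose of this sketch.
FULL_CRUD = frozenset({"create", "read", "update", "deactivate"})


@dataclass(frozen=True)
class MethodRegistration:
    name: str                    # method name, e.g. "o"
    operations: FrozenSet[str]   # operations the method spec defines
    justification: str = ""      # why the method qualifies as special purpose

    @property
    def special_purpose(self) -> bool:
        """A method is special purpose if it lacks any of the full CRUD operations."""
        return not FULL_CRUD <= self.operations


def validate(entry: MethodRegistration) -> None:
    """Accept full-CRUD methods as-is; special purpose methods must say why."""
    if entry.special_purpose and not entry.justification:
        raise ValueError(f"{entry.name}: special purpose method needs a justification")


hash_based = MethodRegistration(
    name="o",
    operations=frozenset({"create", "read"}),
    justification="identifier is derived from immutable content",
)
validate(hash_based)
print(hash_based.special_purpose)
```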

— Christopher Allen

ChristopherA on 26 Mar 2020

@ChristopherA That does seem like a particularly useful way of sorting out some of the "stranger" methods, and perhaps keeping the door open a crack for at least playing around with novel ideas. If some of those ideas catch hold, they could make it into a future version of the spec itself - but they do not have to challenge the progress achieved by focusing DIDs, and they do not need to distract by requiring additions to the use cases.

+1 !

ewelton on 26 Mar 2020

@kdenhartog

> For example, I believe that did:atom:carbon is valid today [..]

I agree with your comment.

Just wanted to point out that there's an interesting difference between did:atom:carbon and did:atom:6. In the second example ("6"), the identifier is an _"intrinsic property that is objectively observable"_ (quoting @dhh1128 here), whereas in the first example ("carbon"), that is not the case.

peacekeeper on 28 Mar 2020

I've gone quiet on this long thread that I started, but I wanted to say thank you to all the smart people who chimed in.

Re. the final pair of comments from @kdenhartog and @peacekeeper : yes to the distinction Markus was trying to highlight. When you have a property that is objectively observable as the basis of an identifier, and everybody knows what property to look for, then you have the interesting phenomenon that multiple observers will automatically be led to agree on the identifier for the object -- even for new objects not yet discovered. This has some very desirable benefits in a decentralized ecosystem. Perhaps Joe is right that this doesn't belong inside the DID umbrella; I'm content to let consensus rule, but just wanted to make the strongest case I could for it.

As the original opener of the issue, I am happy enough with the ensuing discussion to let it be closed now. But we can also keep it open longer if procedure or the preferences of others pushes us that way.

dhh1128 on 28 Mar 2020

I think for those who would like to update the mental model in ways that have been discussed in this thread, a concrete next step would be to:

* Propose that the "create" operation be made optional, just like a while ago we made "update" and "deactivate" optional, OR:
* Demonstrate in some draft version of a DID method spec how the "create" operation would be defined.

peacekeeper on 28 Mar 2020

@peacekeeper A DID using this method-to-be-named would still have a definition of the Create operation, no? It's just that the Create operation in the DID method spec would describe the special way in which DIDs using this method are created.

RE naming, I thought the original proposal was for DIDs using this method to use the multihash format. If so, why not just call it did:multihash:?

talltree on 29 Mar 2020

@talltree I'm keen to name this method type did:o:, a name that can be cast in stone unhindered by future revisions to DID specifications and methodology. An "object is an object" so why not be bold from the outset.

The other argument for sticking with the "O" method type is that there will be a huge number of these identifiers woven into the fabric of the decentralized network. 50% of all identifiers (i.e. anything non-governed within the _data capture_ side of the model) will contain this method type. To help people digest, adopt and ultimately scale this new identifier type, users could simply refer to them as "DID-Os".

pknowl on 29 Mar 2020

+1 to did:multihash over both did:immutable and did:o. The method name should be a hint to how the DIDs are created and resolved, rather than indicating what is being identified.

I think this is another interesting aspect in this thread. Almost all DID methods I am aware of don't restrict what is being identified. This one seems to have such a restriction, i.e. it can only identify what can be hashed.

peacekeeper on 29 Mar 2020

@peacekeeper I suppose the method name should reflect how the community sees the DID space evolving. I, for one, hope that the argument for the development of did:e: (entity identifiers) and did:o: (object identifiers) will be supported by the DIDWG in the future. I'm not saying we need to get there tomorrow but, now that a light has been shone, it will be difficult to ignore.

We have a rare opportunity to name the object identifier correctly right off the bat whilst hinting at an elegant DID syntax evolution for the future. Why wait for governed identifiers to align to the methodology? If the identifier name is set to did:multihash:, it will inevitably have to be renamed to did:o: in the future.

If I'm missing something and did:multihash: will simply be easier to get over the line for DID v1.0 then I'll concede for the greater good but that shouldn't stop the DIDWG from investigating did:e:/did:o: further upstream in a bid to resolve the potential method-type scaling issue highlighted in this thread.

pknowl on 29 Mar 2020

@mitfik has just messaged me saying that he has a feeling that a non-governed object identifier may need to contain more than just a simple 'multihash'. On that note, I propose that the community hold off on a casting vote until the tech guys have had a chance to further investigate what identifier characteristics should be included.

pknowl on 30 Mar 2020

@peacekeeper

> I think this is another interesting aspect in this thread. Almost all DID methods I am aware of don't restrict what is being identified. This one seems to have such a restriction, i.e. it can only identify what can be hashed.

This is critical as I see it, because it is the presence of a controller that defines the semantic space within which the identified exists. I see that as a key strength of controlled DIDs. When you and I talk about the same thing using different DIDs, the only way we can coordinate is by presenting evidence from attached and found information - external claims, credentials, and the like which are linked to the controlled document. That is very valuable, however....

The reason these were of interest was that, like urn:multihash:1234, there is a restriction on what is identified - namely that which can be hashed. It is this property that allows them nearly zero semantic ambiguity - down around the 1 in 2^80 range or better - tweakable by the hash, of course. This means that we can talk about the same thing, using an identifier, without pinning it on a negotiation.
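
For a rough sense of scale, here is a small sketch of the standard union (birthday) bound for hash collisions. It assumes an idealized hash with uniformly distributed outputs, which is my assumption and not something established in this thread; the function name is made up.

```python
from fractions import Fraction


def collision_probability_bound(num_items: int, hash_bits: int) -> Fraction:
    """Upper bound on the chance that any two of num_items distinct inputs
    share a digest, for an ideal hash with hash_bits of output.

    Union bound: (number of pairs) / 2**hash_bits, capped at 1.
    """
    pairs = num_items * (num_items - 1) // 2
    return min(Fraction(1), Fraction(pairs, 2 ** hash_bits))


# A trillion hashed objects under two assumed digest sizes.
for bits in (160, 256):
    bound = collision_probability_bound(10 ** 12, bits)
    print(bits, float(bound))
```

A single pair of objects under an 80-bit digest gives exactly 1 in 2^80; larger digests push the bound down, which is the "tweakable by the hash" part.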

This is useful, for example, when pointing to a credential schema or context or other primitive from which one scaffolds deterministic processing in a decentralized data economy - it provides an "open authority" without simply using DIDs to create "a new root of central authority." I find the concept of a Bitcoin Anchored Semantic every bit as Centrally Controlled as schema.org.

Hashlinks give us a lot of the power needed - and in particular they give us the thing that is missing from simply using did:whatnot:<hash> - namely, hints about location and thus a pathway to resolution. What nothing gives us yet is a specification about what sort of descriptor could come back, and that definitely has value - giving programmers a coordination point that was not bound to specific implementations, but bound to the concept of uncontrolled, self-certifying identifiers.

I also remain concerned about the maintenance of hidden control - the 'create' method would effectively be a 'register' method - but register it in what infrastructure? - which gets, again, to resolution. And it is the infrastructure of the registry which defines the possibility of true "surrender of control" vs. "good samaritan waiving" - I think it makes sense to wait to name this concept until those elements are clear:

* how does create/register work
* how does read work
* how is control surrender enforced

if we can not do these, then we have defined something equivalent to regular DIDs with a claim "this DID that I control is about urn:multihash:1234" - and those DIDs are fine, but they can not be the foundation for scaffolding semantic processing on a decentralized data economy - for that we need a decentralized identifier with broader capabilities than DIDs.

ewelton on 30 Mar 2020

> I think for those who would like to update the mental model in ways that have been discussed in this thread, a concrete next step would be to:
>
> * Propose that the "create" operation be made optional, just like a while ago we made "update" and "deactivate" optional, OR:
> * Demonstrate in some draft version of a DID method spec how the "create" operation would be defined.

I'd say there's probably a few things we could take from this thread as well to make as additions to the did core spec. Some of the arguments against this method have pointed to a few things that are left as tribal knowledge that I'm wondering if we could get normative, testable statements for.

For example, one of @jandrieu's points struck me as pretty strong: on creation of a DID it SHOULD (could be upgraded to MUST) be possible to prove limited control of the identifier via a cryptographic mechanism.

Another one I've been toying around with is the idea of a minimum number of possible namespace entries. E.g. the method specific identifier must be able to identify at least 2^80 unique identifiers. I'm not sure this really adds much enforcement to the idea of the identifier not needing an authority to authorize access to the namespace.
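
A quick sketch of what a testable version of that statement might check: given an alphabet size, how long must a method-specific identifier be before at least 2^80 distinct values are possible? The function name and the 80-bit default are illustrative assumptions, not proposed spec text.

```python
def min_identifier_length(alphabet_size: int, required_bits: int = 80) -> int:
    """Smallest length L such that alphabet_size**L >= 2**required_bits."""
    if alphabet_size < 2:
        raise ValueError("alphabet must have at least two symbols")
    target = 2 ** required_bits
    length = 0
    capacity = 1  # number of distinct identifiers of the current length
    while capacity < target:
        capacity *= alphabet_size
        length += 1
    return length


# Hex and base58 alphabets as example encodings.
for name, size in (("hex", 16), ("base58", 58)):
    print(name, min_identifier_length(size))
```

Integer arithmetic is used on purpose so the boundary case (for example 16^20 = 2^80) is exact and not subject to floating point rounding.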

I also like @ewelton's point about adding at least non-normative statements, and normative statements if possible, around surrendering control, because I feel that was part of the crux of what makes this possible.

@peacekeeper do you have any ideas around other things that might be worth adding for this?

kdenhartog on 31 Mar 2020

Thanks, @ewelton . That is a sound argument but, going back to my original argument of _DIDs for everything_ in a decentralised network which allows us to move into a synergistic future with better naming conventions and smarter identifiers, I'm keen to keep investigating.

@kdenhartog - Are you able to answer Eric's first question ...

1.) How is control surrender enforced?

It's surrender at the point of creation by the intrinsic nature of the method. In other words, control of the knowledge is all that's necessary to create the method. Representation and proof of control is unnecessary after creation, just as it's unnecessary after all keys have been revoked in all other methods.

kdenhartog on 31 Mar 2020

> I'm sorry, is the proposal here to have a did:o namespace that then has multiple methods underneath it?
>
> For example
>
> did:o:sha:123...
> did:o:multihash:abc...
> did:o:myHash:xyz
>
> Is that what you're suggesting @pknowl?

I hope not, that makes the method name even more likely to centralize around a naming authority.

kdenhartog on 31 Mar 2020

> I've gone quiet on this long thread that I started, but I wanted to say thank you to all the smart people who chimed in.
>
> Re. the final pair of comments from @kdenhartog and @peacekeeper : yes to the distinction Markus was trying to highlight. When you have a property that is objectively observable as the basis of an identifier, and everybody knows what property to look for, then you have the interesting phenomenon that multiple observers will automatically be led to agree on the identifier for the object -- _even for new objects not yet discovered_. This has some very desirable benefits in a decentralized ecosystem. Perhaps Joe is right that this doesn't belong inside the DID umbrella; I'm content to let consensus rule, but just wanted to make the strongest case I could for it.
>
> As the original opener of the issue, I am happy enough with the ensuing discussion to let it be closed now. But we can also keep it open longer if procedure or the preferences of others pushes us that way.

It looks like the author of this issue feels satisfied by the discussion that occurred. Next steps for this can go one of two ways (potentially both) I would guess. @mitfik @pknowl and I can draft a strawman DID method to explore what these immutable, surrender-control-on-creation DIDs would look like, or we can begin to propose language to constrain what DID methods are possible.

Any opinions on which way to go?

kdenhartog on 5 Apr 2020

Thanks, @kdenhartog . I believe this is now in the capable hands of @mitfik and a couple others in the HCF tech group to start working on a strawman/draft spec. The workload has suddenly gone through the roof at this end which is why this stream has slowed down. That said, I think we have everything we need for now.

pknowl on 6 Apr 2020

I propose we close this issue then since the did method can be shared via the did method registry. Any objections?

kdenhartog on 7 Apr 2020

No activity since marked pending close, closing.

brentzundel on 17 Apr 2020

Did-core: Can we nuance our mental model on DID control slightly?

Most helpful comment

All 85 comments

3 is not met because the hash provides NO way to demonstrate control. It only demonstrates knowledge of the associated content.

4 is not met because there is no derivable meta-data about the identifier. A CID has no mechanism to lead you to additional details that would allow the core functionality that define DIDs. In particular, there is no way to bootstrap a control framework just from a hash.
