We've received some feedback for our Natural Language implementation from the API team. They have some concerns about our verbose option.
If you're unfamiliar with said option, it essentially collapses data that was deemed non-crucial. The examples below use the following text: Google is an American multinational technology company specializing in Internet-related services and products.
Non-verbose Entities
entities = {
  organizations: [
    'Google'
  ],
  places: [
    'American'
  ]
}
Verbose Entities
entities = {
  organizations: [
    {
      name: 'Google',
      type: 'ORGANIZATION',
      metadata: {
        wikipedia_url: 'http://en.wikipedia.org/wiki/Google'
      },
      salience: 65.137446,
      mentions: [
        {
          text: {
            content: 'Google',
            beginOffset: -1
          }
        }
      ]
    }
  ],
  places: [
    {
      name: 'American',
      type: 'LOCATION',
      metadata: {
        wikipedia_url: 'http://en.wikipedia.org/wiki/United_States'
      },
      salience: 13.947371,
      mentions: [
        {
          text: {
            content: 'American',
            beginOffset: -1
          }
        }
      ]
    }
  ]
}
I believe the intention for verbose was to supply the user with the information returned from the API that covered the most common use cases. However, we may actually be supplying the user with too little information, making the non-verbose mode impractical. I'd like to use this issue to open a line of dialogue to see if we can improve our implementation in this area and to address any concerns from the API team.
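To make the relationship between the two shapes concrete, here is a minimal sketch of how a verbose result could be collapsed into the non-verbose one. The collapseEntities() helper is hypothetical and only illustrates the idea; it is not the library's actual implementation.

```javascript
// Hypothetical helper: collapse verbose entity groups down to names only.
// This is an illustration, not the code google-cloud-node actually runs.
function collapseEntities(verboseEntities) {
  var collapsed = {};

  Object.keys(verboseEntities).forEach(function(group) {
    // Keep only the entity name from each detailed entity object.
    collapsed[group] = verboseEntities[group].map(function(entity) {
      return entity.name;
    });
  });

  return collapsed;
}

// Trimmed-down version of the verbose example above.
var verbose = {
  organizations: [
    { name: 'Google', type: 'ORGANIZATION', salience: 65.137446 }
  ],
  places: [
    { name: 'American', type: 'LOCATION', salience: 13.947371 }
  ]
};

var simple = collapseEntities(verbose);
console.log(simple);
```

Everything except the name (type, metadata, salience, mentions) is dropped in this collapse, which is the information loss the API team is concerned about.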
/cc @jgeewax @omaray @bjwatson @jmdobry @stephenplusplus @monattar
Wait, slow down. How did you do that accordion thing?
@callmehiphop Do you have a PR for the NL work so far?
@jmdobry The <details> element! (@stephenplusplus showed me that one)
@bjwatson I still need to send it through! However, this issue shouldn't affect the outcome of that PR; we agreed that this was sorta outside the scope of what we should try to accomplish.
@callmehiphop Agreed.
My stance is to keep the verbose option. Our job is to take the API response, which has to cater to every possible user and use case, and simplify it for the most common user and use case. The verbose option allows the user to get more, should they need it. And in all cases, they can see the raw API response, which doesn't have any of our decorations or simplifications.
We don't want to misrepresent the API from the client library, but we do want to make it as approachable as possible. As the user's needs change, they can choose the response that is the most useful to them.
We understand that the library team's goal is to hide some complexities from the user, however, I argue that in this particular case, you're hiding the wrong complexity. We, in the API team, along with a large team of NL researchers, PMs, and API reviewers have distilled a broad set of possible information that we can return to the user to a small and crisp set. The final API that we've released is what we believe our users want to see. The additional simplification of that, without us reviewing, is completely at odds with all we've done.
The purpose of manual client libraries is to hide the complexities that get in the way of the user getting the response that they need easily. A well-designed API can and must already produce as crisp a response as possible, which we argue we already do. There are places, however, where a client library can help and the API can't do well. A good example in NL is hiding the complexity of encoding_type, which is the exact right complexity to hide and completely within the scope and power of client libraries (and unfortunately it's not done). Other good examples are hiding retry complexities, or authentication complexities (which the client libraries already do).
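As a rough illustration of the encoding_type point: JavaScript strings are indexed in UTF-16 code units, so a Node.js client could always request UTF-16 offsets on the user's behalf. The sketch below assumes the REST request shape of a document plus an encodingType field; buildEntitiesRequest() is a hypothetical helper, not a google-cloud-node function, and the mention object is hand-built for the example rather than taken from a real response.

```javascript
// Hypothetical helper (not part of google-cloud-node): build an entity
// analysis request that always asks for UTF-16 offsets.
function buildEntitiesRequest(text) {
  return {
    document: { type: 'PLAIN_TEXT', content: text },
    // JavaScript string indices count UTF-16 code units, so offsets in
    // this unit can be used directly with String.prototype.slice.
    encodingType: 'UTF16'
  };
}

var text = 'Google is an American multinational technology company.';
var request = buildEntitiesRequest(text);

// Hand-built stand-in for one mention; the offset is computed locally,
// not returned by the API.
var mention = {
  text: { content: 'American', beginOffset: text.indexOf('American') }
};

// With UTF-16 offsets, locating the mention in the original string is trivial.
var start = mention.text.beginOffset;
var found = text.slice(start, start + mention.text.content.length);
console.log(request.encodingType, found);
```

When no encoding type is sent, offsets come back as -1, which matches the beginOffset values in the examples in this thread.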
FWIW, in my personal use of the Translate, NL, Speech, and Vision APIs via google-cloud-node, I find myself always passing verbose: true.
For the rest of our APIs, we are much more than a transport layer that makes authentication easy and retries unnoticeable. Those are "musts", for sure, but they aren't our only purpose of being. If we were to just wrap the raw API, this handwritten client layer wouldn't be necessary at all. The user could use one of the auto-generated alternatives.
This discussion about NL is no different than any other discussion with another API. It's clear that you've had long discussions about what makes a proper API request and response. What I look for when wrapping an API is what value we can additionally provide to the user. And, even though it might be less, sometimes, it's exactly what they want.
The example from the first post:
entities = {
  organizations: [
    {
      name: 'Google',
      type: 'ORGANIZATION',
      metadata: {
        wikipedia_url: 'http://en.wikipedia.org/wiki/Google'
      },
      salience: 65.137446,
      mentions: [
        {
          text: {
            content: 'Google',
            beginOffset: -1
          }
        }
      ]
    }
  ],
  places: [
    {
      name: 'American',
      type: 'LOCATION',
      metadata: {
        wikipedia_url: 'http://en.wikipedia.org/wiki/United_States'
      },
      salience: 13.947371,
      mentions: [
        {
          text: {
            content: 'American',
            beginOffset: -1
          }
        }
      ]
    }
  ]
}
entities = {
  organizations: [
    'Google'
  ],
  places: [
    'American'
  ]
}
Here's what I know about users:
That's why I don't make the assumption that every use case needs to know the Wikipedia URL and text content offset. I think what they want to know, when they send in a block of text to this method called detectEntities(), is a grouped set of "entities". And, because I'm not absolutely sure that's what they want, they can tell us they need more information by setting verbose: true. And in the case that's still not what they want, they can access the apiResponse property that their callback receives.
My argument is not questioning the value of client libraries. I'm 100% certain that some APIs will be very hard to use without manual client libraries. My argument is that the ML APIs, and the NL API in particular, are at their current stage and form too simple to need additional response manipulation. The form of our output is the result of customer feedback, EAPs, and PM/eng experience in the NLU area (a good example is the wiki URL, which our customers asked for many times and which you're hiding as unnecessary to show). So the library team's effort is not well spent on further wrapping this library. It's already simple enough for the purposes that our users have communicated to us. The only place where I see complexity that could be hidden is around encoding type.
This may no longer be true in the near future, in which case, we'll be more than happy to get manual libraries. At this current point though, we should easily be able to survive with auto-generated.
I didn't choose to hide any response item because it isn't useful; I hid it simply because a user can still have a functional application using the NL API without it. It's a matter of having simple defaults that represent the most likely thing the user wanted to know. That assessment varies from user to user, but it's a certainty that without the items in the default response, our response wouldn't be very useful at all. That's the only distinction, and it's only so that our users have more readable code. Returning a large object wouldn't be user-friendly in comparison to how the rest of our library operates.
_the most likely thing_ the user wanted to know
I think what @monattar is saying is that _the most likely thing_ the user wants to know was already determined, as she said, by "a large team of NL researchers, PMs, and API reviewers" to be "a small and crisp set" distilled from a broad set of possible information, leading to the current NL API.
Was there any study or analysis done to determine that Node.js users in particular want an even smaller set of results than what the NL API returns?
I personally am not super familiar with how people tackle Machine Learning tasks in Node.js, but if it is primarily to consume models in web apps, then perhaps the ultra lean response returned by google-cloud-node is more appropriate? On the other hand, it seems that at minimum you would need the confidence values to use the results in your app's own heuristics.
There is a difference between what our API offers and what an upstream API offers. I assume every service we support is continuously molded by the same types of research the NL team has undergone. The difference is that an upstream API has to return a response that works for as many use cases as possible. If it discards any information its designers deem unnecessary, that can eliminate its viability for users. In the case of our API, we don't deem anything unnecessary, we just "sort by value". And more importantly, we always return the full API response for when users need more.
detectEntities returning { organizations: ['Google'] } is as intuitively simple as I can make that method, while still doing what the name implies. API-wide, that is the goal of every method. It's not a certainty that the user needs to know any other data about the entities, but it is certain they need that response at a minimum.
If they need more information, they can use verbose: true or just access the raw API's response:
language.detectEntities('...', function(err, entities, apiResponse) {
  // entities = [ { organizations: ['Google'] } ]
  // apiResponse = {... un-edited API response directly ...}
})
Are there studies, feedback collections, or any similar material that shows what's shown for verbose=false is what the users want? If yes, please link. If no, that unfortunately becomes your personal opinion/preference, and we never operate based on personal preferences. It needs to be reviewed, similar to the review process that the API usability went through.
I'll happily make any adjustments to our API when it is ordered by the owners of this project.
cc @jgeewax
Thanks. Will discuss with JJG. As the TL of all ML APIs, I can hopefully convince JJG that we should have a say in how our product is represented :) Please still link any material you have, feedback or otherwise, on why you think verbose=true must be the way it is.
Thank you and great work on the ML APIs, they've been a lot of fun to learn and use.
Regarding research materials, I wish I had them. I can only use our issue tracker, StackOverflow, and Twitter searches as a gauge of how our decisions are affecting users. In the case of the verbose option (not specific to NL), I can't recall seeing it brought up outside of this issue. And because we don't do any tracking of the options users are setting, I unfortunately don't have any numbers on how many or how often the default value is being updated.
Hi folks,
On the one hand, I tend to agree that the non-verbose way being the default, while great for demos, is not all that useful in practice.
On the other hand, without a user-study showing otherwise, I have a hard time believing that customers want to see the Wikipedia URL for every entity that shows up. If that was done and we found that the metadata (right now only Wikipedia articles AFAICT) was one of those things that customers saw as critical, I'm happy to eat my words :)
That said, I do agree that the salience missing is a big deal for me (and why I always turn on verbose mode...), so even in the non-verbose version, hiding salience seems silly (most customers would want to check "was this actually relevant or just a side mention" rather than "tell me everything mentioned").
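The "was this actually relevant or just a side mention" check could look something like the sketch below. The filterBySalience() helper and the threshold of 50 are assumptions made for illustration; the salience values are the ones from the examples in this thread.

```javascript
// Hypothetical helper: keep only entities at or above a salience threshold.
function filterBySalience(entities, minSalience) {
  return entities.filter(function(entity) {
    return entity.salience >= minSalience;
  });
}

// Entities in the flat array shape, with salience values from this thread.
var entities = [
  { name: 'Google', type: 'ORGANIZATION', salience: 65.137446 },
  { name: 'American', type: 'LOCATION', salience: 13.947371 }
];

// The threshold of 50 is arbitrary and chosen only for this example.
var relevant = filterBySalience(entities, 50);
console.log(relevant.map(function(entity) { return entity.name; }));
```

A check like this is impossible with the current non-verbose response, since the names arrive without their salience.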
Further, the automagic categorization of entities based on the type seems to be ... iffy as well.
Ideally, what would everyone think if we were to do something like the following:
Default
language.detectEntities('...', function(err, entities, apiResponse) {
  /* entities = [ {
       name: 'Google',
       type: 'ORGANIZATION',
       salience: 65.137446,
     } ]
     Notably missing: metadata and mentions.
  */
})
Verbose
language.detectEntities('...', {verbose: true}, function(err, entities, apiResponse) {
  /* entities = [ {
       name: 'Google',
       type: 'ORGANIZATION',
       salience: 65.137446,
       metadata: {
         wikipedia_url: 'http://en.wikipedia.org/wiki/Google'
       },
       mentions: [
         {
           text: {
             content: 'Google',
             beginOffset: -1
           }
         }
       ]
     } ]
  */
})
"Names only"
language.detectEntityNames('...', function(err, entities, apiResponse) {
  // entities = ['Google']
});
The other alternative (and, I suspect, the one more acceptable to the NL team) would be to use the verbose example above as the default, with the default example above becoming {simplified: true}.
language.detectEntities('...', {simplified: true}, function(err, entities, apiResponse) {
  /* entities = [ {
       name: 'Google',
       type: 'ORGANIZATION',
       salience: 65.137446,
     } ]
     "Simplified" doesn't include metadata and mentions.
  */
})
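A minimal sketch of what the proposed "simplified" shape amounts to: each full entity is reduced to its name, type, and salience. The simplifyEntity() helper is hypothetical, and nothing here reflects how the option would actually be implemented in the library.

```javascript
// Hypothetical helper: reduce a full entity to the proposed "simplified"
// shape (name, type, salience), dropping metadata and mentions.
function simplifyEntity(entity) {
  return {
    name: entity.name,
    type: entity.type,
    salience: entity.salience
  };
}

// Full entity, as in the verbose example above.
var fullEntities = [
  {
    name: 'Google',
    type: 'ORGANIZATION',
    salience: 65.137446,
    metadata: {
      wikipedia_url: 'http://en.wikipedia.org/wiki/Google'
    },
    mentions: [
      { text: { content: 'Google', beginOffset: -1 } }
    ]
  }
];

var simplified = fullEntities.map(simplifyEntity);
console.log(simplified);
```

Both shapes are arrays of objects, so switching the option only adds or removes fields and never changes the data structure the user has to handle.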
Further, the automagic categorization of entities based on the type seems to be ... iffy as well.
Why? Can it be improved before we cut it completely?
Between the choices above, I think always returning the verbose sample and removing the verbose option is the best for the user. Currently, enabling verbose switches the data structure from an array of strings to a detailed object. In the example above, both cases return an object; one just has more data. In that scenario, I'd rather let the user have everything, because the convenience hinged on the simpler data structure.
getEntityNames() is a good solution, but it favors names over the other values, so users might expect getEntityTypes(), etc.
So overall, I think if we are set on always returning an object, let's return the verbose: true version and let the user do their own mapping when they need simpler information. If we require them to do their own grouping as well, those would look like:
language.detectEntities('...', function(err, entities) {
  var entityNames = entities.map(function(entity) {
    return entity.name
  })
  // entityNames = ['Google']

  // filter (not map) keeps only the matching entities
  var organizations = entities.filter(function(entity) {
    return entity.type === 'ORGANIZATION'
  })
  // organizations = [{ name: 'Google', type: 'ORGANIZATION', ... }]

  var groupedEntities = entities.reduce(function(groupedEntities, entity) {
    groupedEntities[entity.type] = groupedEntities[entity.type] || []
    groupedEntities[entity.type].push(entity)
    return groupedEntities
  }, {})
  // groupedEntities = {
  //   ORGANIZATION: [{ name: 'Google', type: 'ORGANIZATION', ... }]
  // }
})
Can we add those as custom methods in the object to make life easier for people?
We don't have a precedent for exposing helper methods like that, but we have some options:
// #1: already built for the user, attached as properties on the response array
// pros: easy, user doesn't have to learn a new method
// cons: could be unexpected that the array is decorated like an object
language.detectEntities('...', function(err, entities) {
  // entities = [ { type: '...', name: '...' } ]
  // entities.organizations = [ /* just orgs */ ]
})

// #2: require the user to build these convenience structures
// pros: doesn't treat the response array as an object
// cons: extra step, another method to discover/learn/remember
language.detectEntities('...', function(err, entities) {
  var groupedEntities = language.groupByType(entities)
  // groupedEntities.organizations = [ /* just orgs */ ]
})

// #3: ??
// @callmehiphop
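For reference, a standalone sketch of what a groupByType() helper like the one in option #2 might do. The function body and the GROUP_NAMES mapping are assumptions; the only type-to-group pairs taken from this thread are ORGANIZATION to organizations and LOCATION to places, and the third input entity is made up to show the fallback.

```javascript
// Assumed mapping from API entity types to friendly group names. Only
// these two pairs appear in the examples in this thread.
var GROUP_NAMES = {
  ORGANIZATION: 'organizations',
  LOCATION: 'places'
};

// Hypothetical standalone version of the groupByType() helper from option #2.
function groupByType(entities) {
  return entities.reduce(function(grouped, entity) {
    // Fall back to the raw type when no friendly group name is known.
    var key = GROUP_NAMES[entity.type] || entity.type;
    grouped[key] = grouped[key] || [];
    grouped[key].push(entity);
    return grouped;
  }, {});
}

var groupedEntities = groupByType([
  { name: 'Google', type: 'ORGANIZATION' },
  { name: 'American', type: 'LOCATION' },
  // Made-up entry, included only to exercise the fallback branch.
  { name: 'Internet', type: 'OTHER' }
]);

console.log(Object.keys(groupedEntities));
```

Because the helper returns a new plain object, the response array itself stays undecorated, which is the main advantage listed for option #2.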
OK, well regardless, step one is to make verbose mode here the default.
Let's discuss adding helpers later on. Sound good?
Sounds good 👍