Forking distinct issue from #10950
Decide whether it is OK that deserialized objects are given a prototype by avsc, causing a slight observable difference between the plain data going in and the data coming back.
- This may not be exactly what's happening here. Put together a repro and figure out exactly what causes deepEqual to fail currently. See https://github.com/Azure/azure-sdk-for-js/pull/10890#discussion_r482449414
I debugged through this and put a repro of the underlying cause below.
Summary: avsc gives objects it deserializes a prototype that adds 7 inherited enumerable properties to the returned result: `clone`, `compare`, `isValid`, `toBuffer`, `toString`, `wrap`, and `wrapped`. This is what causes chai to throw on `deepEqual` comparison, as it considers all enumerable properties, including functions, AFAICT. The workaround pending investigation, `JSON.stringify(left) === JSON.stringify(right)`, holds because the extra enumerable properties are functions, which `JSON.stringify` skips.
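A minimal, avsc-free sketch of why the two checks disagree (the `User` class here is a hypothetical stand-in for avsc's generated record constructor, with an enumerable method on its prototype):

```javascript
// Hypothetical stand-in for an avsc-generated record class: the method
// assigned to the prototype is enumerable, mirroring what avsc produces.
function User(firstName, lastName) {
  this.firstName = firstName;
  this.lastName = lastName;
}
User.prototype.clone = function () {
  return new User(this.firstName, this.lastName);
};

const plain = { firstName: "Nick", lastName: "Guerrera" };
const decoded = new User("Nick", "Guerrera");

// for...in walks inherited enumerable properties, so `clone` shows up
// on the decoded value but not on the plain one...
const keys = [];
for (const key in decoded) keys.push(key);
console.log(keys); // [ 'firstName', 'lastName', 'clone' ]

// ...but JSON.stringify only serializes own properties and skips
// functions, so the string forms still match.
console.log(JSON.stringify(plain) === JSON.stringify(decoded)); // true
```

This is why a deep-equality check that walks enumerable properties fails while the stringify comparison passes.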
So the question remains: is this OK? I am slightly concerned that consumers of schema-registry-avro could take a dependency on these functions, which might prevent us from replacing the avro serializer with another implementation due to the risk of breaking such uses. But maybe that is too paranoid. @xirzec, thoughts?
So far it appears to be non-trivial to strip these properties out (or make them non-enumerable, if we prefer), as the return value can be a graph of objects, each of which would have them. I'm looking into whether the resolver argument to `fromBuffer` might allow me to do this cleanly.
```ts
import * as avro from "avsc";
import { assert } from "chai";

const schema: avro.schema.RecordType = {
  type: "record",
  name: "User",
  namespace: "com.azure.schemaregistry.samples",
  fields: [
    {
      name: "firstName",
      type: "string",
    },
    {
      name: "lastName",
      type: "string",
    },
  ],
};

const type = avro.Type.forSchema(schema);
const value = { firstName: "Nick", lastName: "Guerrera" };
const serialized = type.toBuffer(value);
const deserialized = type.fromBuffer(serialized);

for (const key in value) {
  console.log(key);
}
console.log("");
for (const key in deserialized) {
  console.log(key);
}

assert.deepEqual(value, deserialized);
```
```text
> node .\index.js
firstName
lastName

firstName
lastName
clone
compare
isValid
toBuffer
toString
wrap
wrapped
C:\Temp\rp\node_modules\chai\lib\chai\assertion.js:141
        throw new AssertionError(msg, {
        ^

AssertionError: expected { Object (firstName, lastName) } to deeply equal { Object (firstName, lastName) }
    at Object.<anonymous> (C:\Temp\rp\index.js:50:15)
    at Module._compile (internal/modules/cjs/loader.js:1075:30)
    at Object.Module._extensions..js (internal/modules/cjs/loader.js:1096:10)
    at Module.load (internal/modules/cjs/loader.js:940:32)
    at Function.Module._load (internal/modules/cjs/loader.js:781:14)
    at Function.executeUserEntryPoint [as runMain] (internal/modules/run_main.js:72:12)
    at internal/main/run_main_module.js:17:47 {
  showDiff: true,
  actual: { firstName: 'Nick', lastName: 'Guerrera' },
  expected: User { firstName: 'Nick', lastName: 'Guerrera' }
}
```

Looks like this logic mostly happens here: https://github.com/mtth/avsc/blob/f3c6b1c104815bf239b08311fab1c5dc1ec1290d/lib/types.js#L2185
It's somewhat vexing to me that the library offers no option to return plain objects.
Hi there. If you update avsc to 5.5.1, you can use an option to omit these methods:
```js
const avroType = avro.Type.forSchema(JSON.parse(schema), {omitRecordMethods: true});
```
Note that decoded record values will still have a named constructor for performance reasons. If you'd like to hide this constructor as well, you can do so with a type hook. For example, to copy the values into plain objects:
```js
/** Minimal logical type which returns decoded records as plain objects. */
class PlainRecordType extends avro.types.LogicalType {
  _fromValue(obj) { return {...obj}; }
  _toValue(obj) { return obj; }

  /** Returns a type hook wrapping all records and errors with this logical type. */
  static createHook() {
    const visited = new Set();
    return (schema, opts) => {
      const {name, type} = schema;
      if ((type !== 'record' && type !== 'error') || visited.has(name)) {
        return; // Fall back to default processing
      }
      visited.add(name);
      return new PlainRecordType(schema, opts);
    };
  }
}
```
Sample usage:
```js
const plainType = avro.Type.forSchema({
  type: 'record',
  name: 'Person',
  fields: [{name: 'name', type: 'string'}],
}, {typeHook: PlainRecordType.createHook()});

const buf = plainType.toBuffer({name: 'Ann'});
const val = plainType.fromBuffer(buf);
console.log(val.constructor.name); // Object
```
You can tweak the logical type's implementation above to match the API you settle on: `_fromValue` generates the decoded values exposed to users, `_toValue` determines the data you'd like to accept when encoding. The logical type documentation has more information, including a couple of examples.
@mtth Awesome, thanks!
@mtth you rock! Thank you so much.
I put up a PR with omitRecordMethods: https://github.com/Azure/azure-sdk-for-js/pull/11786
Worked like a charm. Thanks again, @mtth!
I didn't do the maneuver to drop the named constructor (yet). I wasn't sure it was necessary, and there's some risk of losing perf to copying every object, right? @xirzec Do you think it's OK to leave the constructor as something other than Object?
I'm not worried about that; I think JSON.stringify() will work fine without the Object constructor.
Happy to help. Thanks for the kind words :).