Pysyft: REFACTOR: Serde

Created on 9 Oct 2019 · 25Comments · Source: OpenMined/PySyft

Overview

I would like to propose that we establish official standards for Serde in a way that's semi-agnostic to language. I've noticed a good many inconsistencies along the way of making my own version of Serde for syft.js. I'll mention them here as best as I can remember, but I will consider this issue to be a "catch-all" epic for all related Serde issues. This document will be a living document and will avoid problems that are related to language incompatibility (i.e. Javascript doesn't have a concept of Tuple, Dict, etc.).

Proposed Changes

Remove msg_type from all Message classes
This is also being discussed here: https://github.com/OpenMined/PySyft/issues/2649. I would also advocate that if we remove the msg_type field from the Message classes that we should also consider removing part or all of the codes.py file as it will no longer be needed.
Addressed in https://github.com/OpenMined/PySyft/pull/2687
Eliminate parentheses clouding
There are many times where parentheses are added somewhat randomly. In reality, they are not random, but rather a false attempt to create a Tuple. I've noticed many times where using () to create a Tuple inside of an existing set of parens fails to simplify. This leads to very confusing circumstances like the simplification of an Operation: (31, (1, ((6, CONTENT, OF, TUPLE)))) where the third opening paren is redundant. I've also found that this is usually resolved when using tuple() to create a Tuple instead of using parens (()).
Choose between using a List or a Tuple consistently
I've seen examples like in plan.State you have the state_ids being stored as a List, while in plan.Procedure you have arg_ids and return_ids being stored as a Tuple. Why? I'd say we should decide on using either the List or the Tuple for every circumstance unless the use-case is overwhelmingly undesired. My vote is that we use the Tuple because immutability will allow us to better find type errors in PySyft. You don't get this luxury with a List.

Skipping number 4 for now since we can use the strategy presented by @iamtrask below within syft.js
~~Cut down on the amount of classes we're using~~
There are two ways to look at this problem: we can either create a class for everything or we can create a series of high-level generic classes that allow for their internal state to change. An example of this dichotomy is in the Message classes, where we have 1 parent class (Message) which is never used except as a parent to other classes (Operation, ObjectMessage, ObjectRequestMessage, etc.). While this is very convenient in Python to have a new class for every situation, I wonder if this only serves to complicate the work of other projects like syft.js and AndroidWorker.

With every class that's added, as sub-library maintainers, we have to stub out the new class, integrate it, and properly test it. Needless to say, this is a lot of work. However, if we use high-level generic classes only, we can handle changes in "type" with an if statement internally within the class. While this may contradict a bit with the first point I made, I'll use the Message class as an example: the msg_type does tell us which type of Message is being referred to. In syft.js, I then have to create, integrate, and test 8 other classes that have been created as sub-classes. If instead we left the msg_type field in the high-level Message class, now all I need to do is have an if statement in my Message class that tells me how to handle it. That's a lot of time saved for those of us writing PySyft ports into other languages. 😄
Migrate to Proto
Speaking of version compatibility, we at the syft.js team have created Proto to mitigate a lot of the nastiness of maintaining compatibility between versions of each of the various projects. I suggest we migrate to using Proto on PySyft as soon as possible before it gets harder to do so. We have an unmerged pull request that seeks to do this: https://github.com/OpenMined/PySyft/pull/2439
Ignore Protobuf (for now)
There's a merged RFC (https://github.com/OpenMined/rfcs/blob/master/20190822-standardizing-pysyft-protocol.md) that establishes that we should migrate away from using MsgPack and instead use Google's Protobuf. Personally, I'm a fan of this, but I think it's somewhat of a distraction for now. In syft.js, we're not actually using any compression libraries (including MsgPack) and I don't intend to add any in the near future because of... well, file size. What we gain in compression is almost negligible compared to what we lose in file size with an added dependency. Granted, this is only a problem for syft.js since it's in a browser, it has to be downloaded as a web request.

Regardless of syft.js' status on this, we have no dedicated plans for migrating away from MsgPack to Protobuf. However, there are current, real-world concerns that exist with Serde. In my opinion, migrating to Protobuf is a luxury, making Serde consistent is a necessity.
Implement some sort of way to have sample simplification strings auto-generated and tests run upon version committing to dev
This one is quite complicated. However, there's no way currently for us to know when syft.js is "out of date" with PySyft (and vice-versa). It would be quite nice to have some sort of script that could be run by our CI cycle that generates new Serde simplified strings, and re-runs the syft-helpers.js CI cycle with the new data. This might be quite difficult, but at some point, it should help tremendously to anticipate possible simplify/detail bugs before our users notice.

I'll consider this issue somewhat of a moving target. But with that said, I don't feel confident in releasing syft.js to the public without establishing some sort of consistency in Serde on PySyft. We're in need of some major changes here to avoid conflicts between various syft libraries in the future.

Help Wanted Type Type

Source

cereallarceny

👍2

All 25 comments

1, 2, and 3 I wholeheartedly agree with.

As you mentioned, 4 (from a protocol standpoint) returns us back to reversing 1. Given the symmetry between 1 and 4, perhaps language can manage the different message types however is most appropriate for the language (if statements or abstract class hierarchy). Aka, PySyft could keep doing it the way it currently is, and Javascript could choose to implement a single Message class. I'm not a polyglot expert, but this seems like an appropriate flexibility given we want to allow compatibility between lots of languages, and one which wouldn't affect the abstract protocol.
Agreed.
Agreed, although I have a hunch that the compressor will be worth including in the long run (but we can wait until we can run empirical tests before going there).

iamtrask on 9 Oct 2019

4 - Similar to reversing 1. It's not entirely the same thing because currently msg_type refers to a redundant code to specify what type of message it is. This is not needed because you can already infer the message type in Python by either a) determining the instance of the class or b) reading the Serde code. Either way, you're right that we could just make our own abstract class in Javascript to keep track of these changes. I suppose though that it locks us into that decision, even at the risk of more methods being added to that class in Python. At some point, we'd have to do what PySyft is doing to avoid one really huge complex class. I suppose that's the thinking behind that mindset in PySyft anyhow. I wonder if there's some better middle-ground, or perhaps some way to prevent this from causing major typing errors in the event a class is missing in Javascript.

6 - Totally agree, I'm not saying "no". I'm simply saying "this isn't as important of a goal for the here and now".

cereallarceny on 10 Oct 2019

Hey, as the author of the protobufs RFC, just wanted to say this all makes sense to me. Any incremental improvements to the standardization of serde should (could?) help with the effort of Protobufs. Hopefully we'll see some of the development in protobufs come to fruition in the coming months.

justin1121 on 10 Oct 2019

I'm not sure what exactly you're hoping to do here, but it sounds like something Hypothesis might be able to help with.

jvmncs on 10 Oct 2019

Thanks for chiming in @justin1121 - sorry for not tagging you on the issue! I'd be thrilled for us to implement Protobuf once we reach a bit more stability and cross-compatibility with Serde. Appreciate you delaying that RFC until we make more progress here.

@jvmancuso - thanks for joining in as well. I'll look into Hypothesis more. I think what I'm looking for in particular is to actually set up a webhook with the PySyft Github repo to report when changes are made to the dev branch. A service which is triggered by the new code, then generates all possible simplification strings and stores them somewhere on the Proto repo. This could automate some of the discovery of type discrepancies in syft.js.

Basically my concern is that there will inevitably be some change to a class, or the introduction of a brand new class, into PySyft that the syft.js isn't made aware of. We know that we're definitely going to have versioning problems between PySyft and all other syft-based libraries, but this would at least report changes to the Serde protocol before our users need to report bugs. This would be testing across multiple repos and multiple languages. I'll look into this more and report back.

cereallarceny on 10 Oct 2019

UPDATE - October 14th, 2019

We're currently working on each of these issues. The following is the order in which we're doing each task above:

[x] Updating, finalizing, and merging the Proto PR that @vvmnnnkv has done. (Point 5 above)
[x] Testing current Serde implementation
[x] Getting rid of msg_type for all Message classes, and potentially removing the codes.py file assuming it’s also no longer needed. (Point 1 above)
[x] Switching all instances of List to be using Tuple (and ensure that they simplify whether they’re nested within another Tuple or by themselves). (Point 3 above)
[x] Clearing out all “magic parens” in favor of Tuple. (Point 2 above)

cereallarceny on 14 Oct 2019

To point 3 -- I don't think what you're suggesting forbids this, but want to be sure. I strongly suggest that we keep the ability to serde list -> list and tuple -> tuple. In Python, there are just some functions that will behave differently for tuples/lists, so we need to be able to distinguish on the receiving end of an RPC to enable all the functionality for those functions.

For an example, if I want to hook a function from some framework, but that function has different behavior for tuple & lists (perhaps due to immutability/hashability), the simplifying/detailing of the functions arguments will mistakenly change the behavior of the hooked function. This is because the hook_args implementation passes the args/kwargs to hooked functions through serde, so serde needs to be able to respect the difference

jvmncs on 14 Oct 2019

Totally fair point @jvmancuso. I think the concern is more related to things like a list of tensor or worker ID's. In this case, I can't see why a list would be "needed" over a tuple. But for sure, there's going to be exceptions to this rule. Do you know of specific places where this is likely to be an issue?

All that we're trying to do for that point is to establish some sort of consistency. For collections of data where the structure doesn't matter, we should prefer tuple. Nevertheless, we will still support list where needed.

cereallarceny on 15 Oct 2019

Has anyone explored extending ONNX as a possible alternative?

mccorby on 16 Oct 2019

I wasn't familiar with ONNX until doing a bit of light research @mccorby. This looks like a strong alternative. Are you proposing we would use this as a complete replacement to Serde?

My two concerns so far are:

We're really pretty close to getting Serde in a manageable condition and "if it works, don't break it".
The library for ONNX in Javascript is fairly large (but impressively, it's developed by Microsoft). I would have expected a smaller file size, but it looks to be almost as big as TensorFlow.js itself! I don't initially feel comfortable including another dependency that big as I do have to be concerned with the load time of webpages when building syft.js.

Not saying "no" - I think it's a great option - but I have a few initial reservations. Can you speak to any of these concerns?

cereallarceny on 16 Oct 2019

Go ahead. We can think about ONNX later, maybe instead of Protobuf (which is used under the hood by ONNX)
I am not sure about putting so much pressure in this particular item. I tend to look to performance at a later stage and only if necessary.

I'll try to find time to have some ONNX example up and running.

mccorby on 16 Oct 2019

Great! We will move ahead with improving Serde as is and look forward to a
demo of what it could be from you in the future!

On Wed, Oct 16, 2019 at 4:30 PM Jose A. Corbacho notifications@github.com
wrote:

Go ahead. We can think about ONNX later, maybe instead of Protobuf
(which is used under the hood by ONNX)

I am not sure about putting so much pressure in this particular
item. I tend to look to performance at a later stage and only if necessary.

I'll try to find time to have some ONNX example up and running.

—
You are receiving this because you were assigned.
Reply to this email directly, view it on GitHub
https://github.com/OpenMined/PySyft/issues/2654?email_source=notifications&email_token=AAJ44CVYD7SCTPTBF7KMN2TQO4XSLA5CNFSM4I7D6IHKYY3PNVWWK3TUL52HS4DFVREXG43VMVBW63LNMVXHJKTDN5WW2ZLOORPWSZGOEBM5QYI#issuecomment-542759009,
or unsubscribe
https://github.com/notifications/unsubscribe-auth/AAJ44CRI4FK4NKQX3AD42TTQO4XSLANCNFSM4I7D6IHA
.

cereallarceny on 16 Oct 2019

FWIW I'm not sure ONNX would be an expressive enough DSL to accomplish everything we do with serde -- it's pretty specific to the standard ML computation graph, but there are commands & classes we have in serde that don't fit into that paradigm

protobuf seems to be the right level of abstraction we'll need in terms of expressivity/flexibility vs. standardization for the long term, and while there might be more favorable alternatives (see the "alternatives considered" section in @justin1121's RFC), I don't think ONNX is it

jvmncs on 16 Oct 2019

👍1

It seems like 1 and 5 have now been completed in PySyft, yeah? At this point, does it make sense for the workers to start trying to catch up to the serialization changes? Better to wait for the list/tuple/magic parens changes to go through?

karlhigley on 18 Nov 2019

You can do so whenever you want. I figure this will be done asynchronous for syft.js. Besides, syft workers are kind of "premade integration tests" for the Serde refactor. The sooner you implement the above, the sooner you can test against what PySyft does and if the PR's being submitted work as intended. So go ahead and get started with AndroidWorker!

cereallarceny on 18 Nov 2019

I'm making some progress on figuring out what's required to catch the AndroidWorker up to the current state of the serdes, but quickly hitting places where I wish "2. Eliminate parentheses clouding" had been completed. Not sure exactly what changed since 0.2.0a2, but I'm now seeing repetitive parens that break the message unpacking.

(I'm now also looking forward to Protobuf, so that these kinds of low-level serialization issues are easier to fix.)

karlhigley on 22 Nov 2019

Quite frankly, the parens should never have existed. They weren't being
simplified by Serde into tuples properly. It's become quite a headache to
build syft.js's Serde implementation with parens (tuples) not being
simplified as tuples.

We still need someone to sweep through PySyft and do this... If you're
interested!

On Fri, Nov 22, 2019 at 6:51 PM Karl Higley notifications@github.com
wrote:

I'm making some progress on figuring out what's required to catch the
AndroidWorker up to the current state of the serdes, but quickly hitting
places where I wish "2. Eliminate parentheses clouding" had been completed.
Not sure exactly what changed since 0.2.0a2, but I'm now seeing repetitive
parens that break the message unpacking.

(I'm now also looking forward to Protobuf, so that these kinds of
low-level serialization issues are easier to fix.)

—
You are receiving this because you were assigned.
Reply to this email directly, view it on GitHub
https://github.com/OpenMined/PySyft/issues/2654?email_source=notifications&email_token=AAJ44CTVZSUDYWF33XJH253QVAS2NA5CNFSM4I7D6IHKYY3PNVWWK3TUL52HS4DFVREXG43VMVBW63LNMVXHJKTDN5WW2ZLOORPWSZGOEE6Q3KA#issuecomment-557649320,
or unsubscribe
https://github.com/notifications/unsubscribe-auth/AAJ44CSFV5KEIHYJ5RODNRTQVAS2NANCNFSM4I7D6IHA
.

cereallarceny on 23 Nov 2019

The main point of #2 is that there is a difference between how tuple
literals ("()") simplify in Serde and how tuple objects ("tuple()")
simplify in Serde. As worker library authors, we keep seeing examples of
tuple literals that don't simplify.

We want the simplest possible Serde string, and we don't want parentheses
that aren't being simplified as tuples. Parens without an integer
associated with them are "magic parens". We can presume those are intended
to be tuples, but they don't simplify like tuples.

On Mon, Nov 25, 2019 at 5:18 PM Clara Bennett notifications@github.com
wrote:

I'm a little bit confused about point (2). The example given:

(31, (1, ((6, CONTENT, OF, TUPLE))))

are you saying that this is intended to be equivalent to

tuple(31, tuple(1, tuple(tuple(6, CONTENT, OF, TUPLE))))

but is actually equivalent to

tuple(31, tuple(1, tuple(6, CONTENT, OF, TUPLE)))

due to the missing comma? (because (1) == 1 while (1,) == tuple(1)) Or,
is there something else that you're trying to point at?

—
You are receiving this because you were assigned.
Reply to this email directly, view it on GitHub
https://github.com/OpenMined/PySyft/issues/2654?email_source=notifications&email_token=AAJ44CVX2VBDMNSCSL5UCALQVQCG3A5CNFSM4I7D6IHKYY3PNVWWK3TUL52HS4DFVREXG43VMVBW63LNMVXHJKTDN5WW2ZLOORPWSZGOEFDESRQ#issuecomment-558254406,
or unsubscribe
https://github.com/notifications/unsubscribe-auth/AAJ44CX3YJJX66ETKK6O6V3QVQCG3ANCNFSM4I7D6IHA
.

cereallarceny on 26 Nov 2019

2762 reveals some inconsistencies in serde, I think these can be grouped as follows:

No simplification of values. In some cases strings, lists, tuples are just not simplified. You can find these by searching "not simplified" in test_serde_full.py
Wrapping in a non-simplified tuple. This is actually same as 1, but I wanted to mention this separately as it's somewhat semantically different from just missing simplification for a value, and is also referenced above as "magic parens". We can see this in dict type, e.g. simplifying {1: 2, 3: 4} will return: (0, ((1, 2),(3, 4))). More consistent result would be simplifying each of those key/value tuples: (0, ((6, (1, 2)), (6, (3, 4)))). Also, in Procedure: the tuple that contains simplified operations is not itself simplified.
It's not always (<CODE>, (<DATA>)). Sometimes it's (<CODE>, <DATA>). Search for "no wrapping" in test_serde_full.py to see examples, e.g. compare str ((5, (b"abc",))) with Ellipsis ((7, b"")).

I haven't tried to fix any of these inconsistencies before discussing here.

Also, I haven't analyzed List vs Tuple (mentioned by @cereallarceny) yet, but this should be simpler having simplification examples for all types in one place.

vvmnnnkv on 30 Nov 2019

Amazing work @vvmnnnkv! Do you have an opinion on wrapping vs non-wrapping? Personally I'd love to see Serde be as concise as possible, with no wrapping. What are your thoughts?

Likewise, what are the next steps so we can begin fixing the issues you've uncovered in your testing?

cereallarceny on 1 Dec 2019

I'll second the vote for making serde as concise as possible. On the AndroidWorker side, the parsing happens as much based on the position of particular elements as based on the type codes (which are sometimes skipped over entirely.)

karlhigley on 2 Dec 2019

Do you have an opinion on wrapping vs non-wrapping?

In point 2, I think it OK to leave dict as is because it's kind of "primitive" type in serde, so might be fine to do such exclusion for the sake of smaller output.
As for Procedure, it's better to simplify that tuple.
In point 3, I would make everything (<CODE>, (<DATA>)), so you can always unwrap a value in universal way: as its code and list of parameters, even when it's 1 param (because who knows if this will change in the future).

what are the next steps so we can begin fixing the issues

Need to figure out why PR fails in Travis :) I can't reproduce that locally in Ubuntu VM. Then we can address listed issues: add simplification where it's missing, etc.

vvmnnnkv on 3 Dec 2019

Let's do that. I'm sold.

On Tue, Dec 3, 2019 at 10:39 AM Vova Manannikov notifications@github.com
wrote:

Do you have an opinion on wrapping vs non-wrapping?

In point 2, I think it OK to leave dict as is because it's kind of
"primitive" type in serde, so might be fine to do such exclusion for the
sake of smaller output.
As for Procedure, it's better to simplify that tuple.
In point 3, I would make everything (, ()), so you can always unwrap a value in universal way: as its code and list of parameters, even when it's 1 param (because who knows if this will change in the future).
what are the next steps so we can begin fixing the issues Need to figure out why PR fails in Travis :) I can't reproduce that locally in Ubuntu VM. Then we can address listed issues: add simplification where it's missing, etc. — You are receiving this because you were mentioned. Reply to this email directly, view it on GitHub https://github.com/OpenMined/PySyft/issues/2654?email_source=notifications&email_token=AAJ44CTJ3VYM3RS4E46LE53QWYZM5A5CNFSM4I7D6IHKYY3PNVWWK3TUL52HS4DFVREXG43VMVBW63LNMVXHJKTDN5WW2ZLOORPWSZGOEFY5JMA#issuecomment-561108144, or unsubscribe https://github.com/notifications/unsubscribe-auth/AAJ44CVW2I4SQVHEA2LQW3DQWYZM5ANCNFSM4I7D6IHA .


                    
                        
                            
                                
                                cereallarceny
                                on 3 Dec 2019



                                                
                    
                        @LaRiffle @iamtrask @karlhigley 

Once this PR (https://github.com/OpenMined/PySyft/pull/2812) is merged, we can close this epic of an issue.  Nice work @vvmnnnkv!

                    
                    
                        
                            
                                
                                cereallarceny
                                on 16 Dec 2019
                            
                            
                                                                                            
                        
                    
                

                                                
                    
                        Closing as the PR was merged.  @LaRiffle @karlhigley @vvmnnnkv 

                    
                    
                        
                            
                                
                                cereallarceny
                                on 16 Dec 2019
                            
                            
                                                                🎉1



            
                
                    
                        Was this page helpful?
                                                                                                    
                                                                                                                        
                                                                
                                                                
                                                                
                                                                
                                                                                    
                        0 / 5 - 0 ratings


    

        

        
            
                Related issues
                                                
                    
                         syft.exceptions.UndefinedProtocolTypeError: syft.execution.placeholder_id.PlaceholderId is not defined in the protocol file
                    
                
                
                    
                    durden62
                                         · 
                    3Comments
                                    
                 
                                                
                    
                        One-worker Bug
                    
                
                
                    
                    wentaiwu92
                                         · 
                    4Comments
                                    
                 
                                                
                    
                        Duet loop issue
                    
                
                
                    
                    mgale694
                                         · 
                    3Comments
                                    
                 
                                                
                    
                        SecureNN forces tensors to be shared within 2 workers only
                    
                
                
                    
                    aristizabal95
                                         · 
                    3Comments
                                    
                 
                                                
                    
                        installation
                    
                
                
                    
                    beatrizsmg
                                         · 
                    4Comments