Scratch-vm: Design specification for "SB3"

Created on 19 Sep 2016  ·  36 Comments  ·  Source: LLK/scratch-vm

Serialization Requirements

  • Serialization format must represent the program
  • Serialization format must represent the program's current state (backgrounds, variables / lists, sprites, etc.)
  • Serialization format must be able to be packaged alongside assets (sounds, images, etc.)
  • Serialization format should use space efficiently
    • Compression
    • Duplication (see "lists" issue in SB2)
    • Representation (balance with "readability"?)
  • Serialization format should be able to be strongly validated (JSON schema)

Package Requirements

  • Package format must include all information needed to replicate and execute the program without an internet connection. Possible exceptions:
    • Cloud Variables
    • Extensions
  • Package format should use space efficiently

Considerations / Questions

  • How do we represent Scratch extensions (e.g. SB2)?
  • How do we ensure that the addition of new blocks / functionality in the future can be easily accommodated (flexibility)?
  • How do we accommodate SB3 files in our projects / asset infrastructure?
  • How do we handle issues of backwards compatibility?
    • SB2 -> SB3 conversion
    • SB3 files should throw a friendly error message to Scratch < 3.0 clients

Related

  • GH-128

  • GH-127
  • https://github.com/llk/scratch-parser

All 36 comments

What's the state of https://github.com/llk/scratch-parser? Is it anywhere close to useable?

@BookOwl It's usable for SB2 projects. In fact, we use it for research quite a bit.

Can you please make it public? I would really like to mess around with it.

@BookOwl All set. Very much still a work-in-progress. Any contributions would be most welcome!

Wow, thank you! :D

I would love to see the ability to use Scratch Extensions offline as well. In terms of keeping file size small, it makes sense to host most extensions online, but if users want to explicitly choose to include extension files within their scratch projects, that would be a neat feature to have available.

Not sure if this is related, but one thing I would very much appreciate is the introduction of 'libraries' or 'packages' or something similar. It's very tedious to copy and paste custom blocks from one project to another and a feature of at least being able to import scripts from another project would be nice.

@birdoftheday Nice idea, but maybe better to place it in the suggestions forum or a separate issue

@thisandagain Also, wouldn't the "username" block also be an exception for using the project without internet? (not that it really affects the file design specification anyway)

Re: "Serialization format should use space efficiently" - are we aiming for efficient usage post-compression or pre-compression? For example, the duplication issue shouldn't matter if we're aiming for post-compression, and neither should the verbosity of XML. But if we're aiming for pre-compression efficiency, clearly it would :)

Nice idea, but maybe better to place it in the suggestions forum or a separate issue

@chooper100 a tad meta, but it would probably have something to do with the storage of SB3 projects :P

EDIT: Though it _is_ probably fit for another idea, to keep discussion organized.

The serialization format should also be efficient to parse, so large
projects would not crash the player or take a very long time to load.

XML is certainly not a fast nor efficient serialization format.

@as-com Why not? Any benchmarks I've seen say XML parsing is about the same as JSON parsing, only milliseconds of difference. The only inefficiency is in repeated representation - but with compression, that becomes irrelevant :) Any evidence of your claim?

@tmickel Yup. Good point. I was thinking of our current issues with lists in SB2. Pre-compression this causes problems with the project upload server (as it sends along the JSON payload uncompressed) but post-compression it doesn't matter. We could upend that entire problem if the JSON payload itself were compressed.
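To make the pre- versus post-compression distinction concrete, here is a rough Node.js sketch. The project shape and list contents are invented for illustration and are not the real SB2 layout; only `zlib.gzipSync` and `Buffer` are real Node APIs. It stores the same list twice and compares the raw JSON size with the gzipped size.

```javascript
const zlib = require('zlib');

// Hypothetical project: the same long list is stored twice, loosely
// mimicking the SB2 "lists" duplication mentioned in this thread.
const bigList = Array.from({length: 5000}, function (_, i) {
    return 'item ' + (i % 50);
});
const project = {
    lists: {scores: bigList},
    monitors: [{listName: 'scores', contents: bigList}] // duplicate copy
};

const raw = Buffer.from(JSON.stringify(project), 'utf8');
const rawBytes = raw.length;
const gzippedBytes = zlib.gzipSync(raw).length;

console.log('uncompressed bytes:', rawBytes);
console.log('gzipped bytes:', gzippedBytes);
```

With repetitive data like this, the gzipped payload should be a small fraction of the raw one, which is why duplication mostly hurts when the JSON is uploaded uncompressed.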

http://www.cs.tufts.edu/comp/150IDS/final_papers/tstras01.1/FinalReport/FinalReport.html

In case you decide to use an implementation similar to mine, it is available at TheBrokenRail/scratch-vm on the branch SB3LoadSave. Note: JSON only.

It's a minor point, and I'm not sure exactly how to phrase it, but I'd like to add another goal. I'll describe it by describing the Scratch 2.0 behavior that I'd like to avoid.

The Scratch 2.0 project.json file contains integer ID numbers for each sound or costume, and another for the pen layer (treated in many ways like a costume). These ID numbers are meaningless when dealing only with the online editor, and if you look at the JSON that the editor downloads from the server these values are usually all set to -1.

These ID values are used exclusively by SB2 files on disk. The SB2 "version" of a project.json file has non-negative integers for these IDs, and those numbers are used to look up the appropriate asset file within the SB2.

Take this example:

    "sounds": [{
            "soundName": "pop",
            "soundID": 0,
            "md5": "83a9787d4cb6f3b7632b4ddfebf74367.wav",
            "sampleCount": 258,
            "rate": 11025,
            "format": ""
        }],

This means "look for 0.wav in the SB2". You know to look for 0 because of the soundID field, and you know to look for a .wav file because the md5 field ends in .wav.

In the new format, I would prefer not to include soundID or any equivalent; I'd rather have a file called 83a9787d4cb6f3b7632b4ddfebf74367.wav inside the SB3 archive. I think that's more friendly to a human trying to play with the SB3 format or a particular SB3 file, and I think it would have some side benefits in some of our command-line tools.
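As a small sketch of the difference, the two lookups could be written like this. The function names are hypothetical; the sound object is the example from above.

```javascript
// SB2-style lookup: the numeric ID plus the extension taken from the md5 field.
function sb2AssetFilename(sound) {
    const ext = sound.md5.slice(sound.md5.lastIndexOf('.'));
    return String(sound.soundID) + ext;
}

// Proposed SB3-style lookup: the md5 field already is the filename.
function sb3AssetFilename(sound) {
    return sound.md5;
}

const pop = {
    soundName: 'pop',
    soundID: 0,
    md5: '83a9787d4cb6f3b7632b4ddfebf74367.wav'
};

console.log(sb2AssetFilename(pop)); // 0.wav
console.log(sb3AssetFilename(pop)); // 83a9787d4cb6f3b7632b4ddfebf74367.wav
```

The second form needs no extra ID field, and the filename alone says which asset it is.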

@cwillisf Wow that is pretty shocking actually! Definitely agree.

(tl;dr below)

As per this, quoting myself from there:

Remember, old Scratch extensions are also JavaScript based, not JSON based. That's a really handy thing IMO. And since Scratch 3.0 is also JavaScript based, the more direct access to an extension object we can have, the better.. with this we could, for example, even update the spec of a block from _inside_ the extension! (by doing var myBlock = {..cb: /* modifies myBlock */}; var myExtension = [myBlock];)

Also on the topic of extensions - how much access should we directly give an extension block to Scratch? In my opinion it should be the same as other blocks -- access through util, this.vm, and whatever. That way extensions can be as powerful as they should be able to - as powerful as any other block. Isn't that the point of extensions?

But I'd like to clear up some things, since I was a little rushed when I was writing that...


Remember, old Scratch extensions are also JavaScript based, not JSON based. That's a really handy thing IMO.

Basically, I'd like to be able to _set up_ things _before_ running my extension. I'd also like to have variables _local_ to the extension. That's only possible like this (sb(x)2):

// IIFE, keep variables inside local to just that function call.
void (function() {
  var myVar = 123

  ext.setMyVar = function(value) {
    myVar = value
  }

  ext.getMyVar = function() {
    return myVar
  }

  // register, etc
}())

With a "JSON" approach, I'm not sure how one could have a default value for an extension-local variable, at least not as elegantly as that.


The rest of my post is a rushed mess of stuff mostly covering how I'd like extensions to improve.. I'll try to do better here.

I'd suggest these changes:

Make menus use getter functions. Probably optionally, with the other option being an array. As an example, take this simple dynamic variable extension:

var variables = [
  {name: 'foo', value: 'bar'},
  {name: 'kaz', value: 'fluffy kittens!'}
]

// ..

menus: {
  variableOperations: ['create', 'delete'],
  variableNames: function() {
    return variables.map(function(variable) {
      return variable.name
    })
  }
}

As a follow-up to the next section, I'm honestly not sure how menus work with _real_ Scratch blocks but I know that at some point (even now?) there was a variable dropdown list that dynamically contained list items based on the global and target-local variables of the selected sprite - I'd imagine what I'm going for should work something like that.
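One way the host side could support both forms is sketched below. `resolveMenu` is a hypothetical helper, not part of any existing Scratch API; it just shows that accepting "array or getter" costs very little.

```javascript
// Hypothetical helper: accept either a static array or a getter function.
function resolveMenu(menu) {
    const items = typeof menu === 'function' ? menu() : menu;
    if (!Array.isArray(items)) {
        throw new TypeError('A menu must be an array or a function returning an array');
    }
    return items.slice(); // copy so callers cannot mutate extension state
}

const variables = [
    {name: 'foo', value: 'bar'},
    {name: 'kaz', value: 'fluffy kittens!'}
];

const menus = {
    variableOperations: ['create', 'delete'],
    variableNames: function () {
        return variables.map(function (variable) {
            return variable.name;
        });
    }
};

console.log(resolveMenu(menus.variableOperations)); // [ 'create', 'delete' ]
console.log(resolveMenu(menus.variableNames));      // [ 'foo', 'kaz' ]

// The getter is re-evaluated on every call, so new variables show up.
variables.push({name: 'baz', value: 1});
console.log(resolveMenu(menus.variableNames));      // [ 'foo', 'kaz', 'baz' ]
```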

Give extension blocks as much power as you give normal blocks. In my opinion, for "basic" mods, extensions could _just work._ And the way to do that is to make them work just like normal blocks. For example, a push/pop sprite state extension:

// Just an ES6 symbol so that other extensions that might like to
// have their own 'states' property won't get interfered with by this.
var statesSymbol = Symbol('States')

// ..

pushState: function(args, util) {
  if (!(statesSymbol in util.target)) {
    util.target[statesSymbol] = []
  }

  // getState is a helper function that returns an object
  // containing x, y, size, and whatever other properties
  // about the target it's given.
  var state = getState(util.target)

  util.target[statesSymbol].push(state)
},

popState: function(args, util) {
  if (statesSymbol in util.target && util.target[statesSymbol].length) {
    var state = util.target[statesSymbol].pop()

    // loadState is a helper function that works like the reverse
    // of getState - it uses the object returned from getState as
    // a source of properties to return the target to.
    loadState(util.target, state)
  } else {
    throw new Error('That sprite doesn\'t have any saved states!')
  }
}

I don't think much further needs to be explained with this, it's pretty clear that one of the severe limitations of extensions is that they don't really have any access to the sprite they're working with. (_Mostly._)
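For completeness, a self-contained version of the push/pop idea might look like the following. `getState` and `loadState` are filled in as hypothetical helpers, and the `util` object is a plain stand-in for whatever the VM would really pass to a block; the property names (x, y, size) are assumptions.

```javascript
const statesSymbol = Symbol('States');

// Hypothetical helpers; which properties make up a "state" is an assumption.
function getState(target) {
    return {x: target.x, y: target.y, size: target.size};
}

function loadState(target, state) {
    Object.assign(target, state);
}

const blocks = {
    pushState: function (args, util) {
        if (!(statesSymbol in util.target)) {
            util.target[statesSymbol] = [];
        }
        util.target[statesSymbol].push(getState(util.target));
    },

    popState: function (args, util) {
        const stack = util.target[statesSymbol];
        if (!stack || stack.length === 0) {
            throw new Error('That sprite doesn\'t have any saved states!');
        }
        loadState(util.target, stack.pop());
    }
};

// Stand-in for a real target and util object.
const util = {target: {x: 0, y: 0, size: 100}};
blocks.pushState({}, util);
util.target.x = 50;
blocks.popState({}, util);
console.log(util.target.x); // 0
```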

PS my mind just exploded with a very possible script-local-variables extension by manipulating data stored on things like util.stackFrame.. so that's cool!

tl;dr, Scratch extensions should behave like literal code extensions to Scratch - they should behave (as much as possible) just the same as if you had added to the original source code of Scratch 3.0.

(EDIT: Sorry for getting close to doubling the length of the page!)
(EDIT2: PPS->PS, duh)
(EDIT3: add a missing colon 🙃 )

Also, for compression, maybe SB3 files (whatever they end up containing) should all be gzipped so that there is more space for bigger projects.

There may be a better way to compress them.

Something like this:

{
     "name": "A name of project",
     "targets": [
       {
         "name": "A target name",
         "costumes": [
           {
             "name": "A costume name",
             "image": "A Base64-encoded image of the costume"
           }
         ],
         "effects": {
           "effect_name": "effect_value"
         },
         "current_costume": "ID of current costume",
         "original_clone": {
           "x": "x",
           "y": "y",
           "blocks": [
             {
               "ID": "NUMERICAL ID of the block",
               "opcode": "opcode_spec_for_block",
               "fields": {
                 "name": "value"
               },
               "inputs": {
                 "name": "block_ID"
               },
               "next": "opt_next_block_ID"
             }
           ]
         },
         "clones": [
           { "ID":"numerical_ID", "...": "..." }
         ]
       }
     ]
   }

?
P.S. I'm working on an exporter for this standard
Edit: Base32 -> Base64
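To show how a reader of this layout might follow a script, here is a minimal sketch that walks the proposed "ID" / "next" links. The function name and the sample blocks are made up for illustration; only the field names come from the proposal above.

```javascript
// Hypothetical walker: follow "next" links from a starting block ID
// and collect the opcodes of the stack in order.
function scriptOpcodes(blocks, startId) {
    const byId = new Map(blocks.map(function (block) {
        return [block.ID, block];
    }));
    const opcodes = [];
    const seen = new Set(); // guards against accidental cycles
    let id = startId;
    while (byId.has(id) && !seen.has(id)) {
        seen.add(id);
        const block = byId.get(id);
        opcodes.push(block.opcode);
        id = block.next;
    }
    return opcodes;
}

// Illustrative data only.
const blocks = [
    {ID: 1, opcode: 'event_whenflagclicked', fields: {}, inputs: {}, next: 2},
    {ID: 2, opcode: 'motion_movesteps', fields: {}, inputs: {STEPS: 3}, next: null},
    {ID: 3, opcode: 'math_number', fields: {NUM: 10}, inputs: {}}
];

console.log(scriptOpcodes(blocks, 1));
// [ 'event_whenflagclicked', 'motion_movesteps' ]
```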

@dekrain Base32 sounds a bit wasteful. Did you mean Base64?

I don't think putting assets into a JSON file is optimal: many parsers would choke on a 67 MB JSON file.

Hmm... I thought I heard somewhere Base64 is longer than Base32.

@dekrain Base64 increases filesize by 33%, base32 increases filesize by 60%.
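Those figures follow from the encodings themselves: Base64 emits 4 characters for every 3 bytes, while Base32 emits 8 characters for every 5 bytes. A quick sketch of the arithmetic (helper names are arbitrary, padding included):

```javascript
// Encoded length in characters, including padding.
function base64Length(byteCount) {
    return 4 * Math.ceil(byteCount / 3);
}

function base32Length(byteCount) {
    return 8 * Math.ceil(byteCount / 5);
}

const bytes = 3000000; // a 3 MB asset
console.log(base64Length(bytes)); // 4000000 (about 33% larger)
console.log(base32Length(bytes)); // 4800000 (60% larger)

// Cross-check the Base64 figure against Node's built-in encoder.
console.log(Buffer.alloc(3000).toString('base64').length); // 4000
```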

So, what do you recommend for me?

It's probably best to store the assets alongside the JSON, perhaps in a ZIP file, and reference them by filename (which might be its hash).
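A minimal sketch of the hash-as-filename idea, using Node's built-in `crypto` module. The helper name and the sample bytes are placeholders; MD5 is chosen only because SB2 already uses an md5 field.

```javascript
const crypto = require('crypto');

// Hypothetical helper: name an asset after the MD5 of its bytes.
function assetFilename(data, extension) {
    const hash = crypto.createHash('md5').update(data).digest('hex');
    return hash + '.' + extension;
}

// Stand-in bytes; a real exporter would pass the sound or image data.
const fakeWav = Buffer.from('hello');
const filename = assetFilename(fakeWav, 'wav');

// The JSON then only needs to carry the filename, not the data itself.
const soundEntry = {soundName: 'pop', md5: filename};

console.log(filename); // 5d41402abc4b2a76b9719d911017c592.wav
console.log(JSON.stringify(soundEntry));
```

Identical assets hash to the same filename, so an archive built this way would also deduplicate them for free.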

Ok, for now I did this:

/**
 * @fileoverview
 * An exporter for Scratch 3.0 format.
 * Scratch 3.0 format:
 * {
     "name": "A name of project",
     "targets": [
       {
         "name": "A target name",
         "costumes": [
           {
             "name": "A costume name",
             "image": "A Base64-encoded image of the costume"
           }
         ],
         "effects": {
           "effect_name": "effect_value"
         },
         "current_costume": "ID of current costume",
         "original_clone": {
           "x": "x",
           "y": "y",
           "blocks": [
             {
               "ID": "NUMERICAL ID of the block",
               "opcode": "opcode_spec_for_block",
               "fields": {
                 "name": "value"
               },
               "inputs": {
                 "name": "block_ID"
               },
               "next": "opt_next_block_ID"
             }
           ]
         },
         "clones": [
           { "ID":"numerical_ID", "...": "..." }
         ]
       }
     ]
   }
 */


/**
 * Exporter for SB3 format.
 * @param {!Runtime} runtime The Runtime to export the project from
 * @return {?string} A JSON string representing the project
 * @exception {Error} An Error thrown when any exception in the project has occurred
 */
function sb3export (runtime) {
    var json = Object.create(null),
        target, i;
    json.name = 'new Project'; // No project name changing for now
    json.targets = [];
    // Iterate by index: for...in would yield the keys, not the targets.
    // (Assumes runtime.targets is an array.)
    for (i = 0; i < runtime.targets.length; i++) {
        target = runtime.targets[i];
        json.targets[i] = {};
        json.targets[i].name = target.getName();
        json.targets[i].costumes = [];
        // @TODO: costumes
        // @TODO: effects (unfinished in the original post; empty placeholder)
        json.targets[i].effects = {};
    }
    return JSON.stringify(json, null, 4);
}

Something I'd like to mention in favour of referencing assets by hashes rather than Base64 encoding:
Currently, the ST limits the size of the project JSON file to somewhere around 5MB.
Assuming this policy will continue into Scratch 3, I wouldn't want the space that could be used for scripts to be taken up by unnecessary images (e.g. blank bitmaps), even if the size limits were increased.

Keeping the files separate allows a better distribution of project space allocations.

So, should I just pack it into a ZIP?

Ok, I'll do it after school.

Ok, I'm back. Now I can complete this.

Thanks for the thorough write-up of your thoughts on extensions @liam4 😄

I did the base of standard here. Is this valid?

It'd be a good idea to have a compatibility mode flag inside the actual SB3 file for when you edit an SB2 project (#347).

A way to toggle compatibility mode would also be nice, for people updating their projects from Scratch 2.0 to use 3.0's 60tps.

Work from @morantsur and myself can be found here:
https://github.com/LLK/scratch-vm/compare/develop...morantsur:develop

Looks great, but how will offline files work? Can you add SB3 import/export to the playground? You could also include vector images inside the JSON, as vector images are typically pretty small.
