I find the definition of formatting context in the spec a bit unclear.
1.
A formatting context is the environment into which a set of related boxes are laid out.
What is an "environment"? This is one of those typical blanket words. Without a definition of environment, this sentence is not saying much. For now I'll take FC as a synonym for "box layout", where layout is a visual arrangement.
2.
A box either establishes a new independent formatting context or continues the formatting context of its containing block.
So each box establishes a formatting context (defines a layout) for its descendant boxes. This makes sense, since its descendant subtree must be laid out somehow. But what does "continue a formatting context" mean here? How can one box nested inside another continue in the identical layout the other is in? After all, the nested box is visually "inside" the other box. Or does "continue" mean that the box establishes a new formatting context of the same type as that of its containing block?
Also I'm not sure if "containing block" is correct here. It's defined in the specific layout "Positioned Layout" which is one type of layout of many (Flow / Block-Inline, Grid, Flex, Table), so using it as part of the definition of layouts in general is kind of circular.
3.
The type of formatting context established by the box is determined by its inner display type
An inner display type is defined for an element, not for a box. The inner display type may even generate multiple boxes, each with a different formatting context, as for display: table.
As an aside, there are more definitions that are circular, like for example of "inline box"
A non-replaced inline-level box [box that participates in inline formatting context] whose inner display type is flow.
Ok, same problem as above: a box doesn't have a display type, the associated element does. But let's read on to see what inner display type flow does.
If its outer display type is inline [..], and it is participating in a block or inline formatting context, then it generates an inline box.
Ugh. Am I the only one that has a hard time making sense of the current spec?
How can one box nested inside another continue in the identical layout the other is in?
Fragmentation. When a node is split into multiple boxes at the end of a line, page, column or region.
"Containing block" is a term that originates from CSS2 - a spec which is still useful to give a high-level overview of the process (something which, inevitably, has been a bit lost with the modularization of CSS3)
An inner display type is defined for an element, not for a box.
It's defined for both. But CSS layout primarily concerns itself with boxes.
I'm not sure what point you're trying to get at with all this - some of the definitions are a bit complicated, but CSS layout is complicated. If you're suggesting improvements to the wording, great, although I'm not clear what they are. And if you're trying to pick up the CSS display or box modules and expect to read it and fully understand what's going on, it's not going to happen. I think CSS2.1 is still a better resource for this - primarily because it has a start, middle and end rather than a collection of modules.
Thanks, I'll read more into fragmentation.
You're right, it is far from being easy to comprehend. Although from the parts of CSS2 I read while studying CSS layout I found CSS3 to be quite a bit clearer.
I think clearly defining all terms would help a great deal to improve clarity. The Glossary in css-display-3 is a first attempt at this, although it's still kind of a copy-paste from the relevant CSS2 sections and contains all of these circular definitions.
I'd be happy to improve the wording myself if I understood the whole thing, but I don't. Until then, the point of my post can be seen as a reader's feedback.
How can one box nested inside another continue in the identical layout the other is in?
For example, subgrids
#foo { display: grid }
#bar { display: grid; grid-template-columns: subgrid }
md5-5fa65322f291a43e5d2bf84e5b763abd
md5-259ff38cd3fc67da83b16c8c6d8ffcac
Then `#foo` is a block container and establishes an inline formatting context (IFC). `#bar` is an inline which participates in that IFC. `#bar` does not establish any FC of its own, so `#baz` participates in the IFC established by `#foo`.
</details>
<details><summary>Another example: nested blocks.</summary>
md5-50ec9749087e92317ef25e705a552b5a
md5-3371cd7fc9ff61957b2149692b63a4ec
Then #foo is a block container and establishes a block formatting context (BFC). #bar is a block which participates in that BFC. #bar does not establish any FC of its own, so #baz participates in the BFC established by #foo.
I disagree with "fragmentation". Fragmentation is relevant to the fragment tree, here we are talking about the box tree.
Also I'm not sure if "containing block" is correct here.
Seems correct. It can't be "parent box" because you can have a block inside an inline. The inline is participating in an inline formatting context (IFC), but the block is participating in a block formatting context (BFC). And the IFC may be established by a descendant of the BFC root.
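A minimal sketch of the block-inside-inline case (the ids are made up for illustration):

```html
<!-- Hypothetical ids, for illustration only. -->
<div id="root">
  <span id="inline">
    before
    <!-- The parent box of #block is #inline, but its containing block
         comes from #root, the nearest block container ancestor. -->
    <div id="block">block inside an inline</div>
    after
  </span>
</div>
```

Here `#inline` participates in an IFC, while `#block` is block-level and participates in a BFC, so "parent box" would point at the wrong box.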
Well, technically the containing block is just a rectangle, but there is handwaving: "If properties of a containing block are referenced, they reference the values on the box that generated the containing block."
An inner display type is defined for an element, not for a box
Oh, see #1480. The outcome was good enough for me, but you seem more purist :)
there are more definitions that are circular, like for example of "inline box"
Note that the proper place for defining "inline box" is the Flow Layout spec, which doesn't exist yet. CSS Display is just trying to provide a quick summary.
@Loirooriol ah yes sorry, you're right. It's not fragmentation, I misread the original text.
Thanks @Loirooriol, I really appreciate the examples.
For me to understand any of it, it's logically necessary to first establish a solid definition of formatting context, since the given definition as an "environment" is not clear. I'm kind of fishing in the dark: without the proper definition I can't understand it, and without understanding it I can't come up with the proper definition. It's like finding the mathematical function that maps each input to the correct output given only the input-output pairs.
I've been trying out a few definitions and want to share my latest attempt. It's not fully watertight though as there are still places in the spec which it can't explain. Please correct me, wherever my reasoning is wrong.
A formatting context is a particular box layout applied to a set of boxes.
where "box layout" is Flow Layout, Flex Layout, Grid Layout, Table Layout, etc. as defined in the different Layout modules.
A formatting context can be thought of as "way to arrange a given set of boxes". Mathematically it can be thought of as a function which takes a set of boxes and assigns them coordinates and sizes.
It needs to be defined how a formatting context treats nested boxes. Most certainly we want to keep the boxes in the same logical order as they are nested in the box tree. For example in an inline formatting context, we want the following to be laid out as foo1 bar1 bazbaz kikkik bar2 foo2.
<span id="foo">
  foo1
    <span id="bar">
      bar1
        <span id="baz">bazbaz</span>
        <span id="kik">kikkik</span>
      bar2
    </span>
  foo2
</span>
How would the formatting context define this order? What makes "bazbaz" come before "bar2" or "foo2"?
To solve this, we let each box define a formatting context. A formatting context applies only to the immediate children. It deals only with the flat set of boxes of box tree depth 0 and leaves any subtree that might be contained in it to the formatting contexts of the nested boxes.
So the formatting context of foo only lays out "foo1", bar and "foo2". Then bar creates a formatting context which lays out "bar1", baz, kik and "bar2". And so on recursively.
The layout of boxes in a given formatting context is confined by the box that establishes the parent formatting context. Essentially, a given formatting context doesn't see the outside world. All it knows is the position constraint within the box of the parent formatting context.
The formatting contexts are nested in the same way as the box tree.
The advantage of this definition of the term "formatting context" is that it is compatible with the definitions of the different existing formatting contexts. For example, the definition of the block formatting context reads
In a block formatting context, boxes are laid out one after the other, vertically, beginning at the top of a containing block.
Similarly, the inline formatting context says "horizontally" instead. There is no notion of nested boxes. By eliminating the need for the formatting context to know how to handle nested boxes, the term becomes well defined. For this it was necessary that each box creates a formatting context, and this formatting context applies only to its immediate children.
Also there is no notion of "continuing an existing" formatting context vs. "creating a new" formatting context anymore. Each box creates a formatting context, which might or might not be the same as the parent formatting context.
In the above example each foo, bar, baz and kik create an inline formatting context. So foo creates a formatting context in which it contains first "foo1" then bar then "foo2". Then bar creates a new formatting context in which it contains first "bar1" then baz and kik and then "bar2". Finally baz and kik each create a new formatting context which contain "bazbaz" and "kikkik" respectively.
Since each formatting context is confined to the parent formatting context, this explains why nested boxes are confined by boxes from further up in the box tree. In the example, "bazbaz" can never come after "bar2" or "foo2", because the formatting context created by baz is confined to the one created by bar which is confined to the one created by foo.
Visually I like to think about the formatting contexts being nested in the same way as the boxes.

A formatting context can be thought of as "way to arrange a given set of boxes"
Yes. I would say that a formatting context is the set of rules that dictate how to lay out descendant boxes.
a given formatting context doesn't see the outside world
Yes, and that's why we can't have all boxes establish an independent formatting context.
<div style="float: left">float</div>
<div id="parent" style="border: 3px solid blue">
  <div id="child" style="border: 3px solid green">text</div>
</div>
Note that the float shrinks the line boxes of #child. But #child is not a sibling of the float, it's inside #parent. So if #parent established an independent formatting context, #child could not be affected by the float.
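For comparison, a sketch of the same markup where `#parent` does establish an independent formatting context; `display: flow-root` is my addition here, the rest is unchanged:

```html
<div style="float: left">float</div>
<!-- display: flow-root makes #parent the root of a new, independent BFC. -->
<div id="parent" style="display: flow-root; border: 3px solid blue">
  <div id="child" style="border: 3px solid green">text</div>
</div>
```

Now the float is outside the BFC that `#child` participates in, so it no longer shortens the line boxes of `#child`. Instead, the border box of `#parent` as a whole is placed so that it does not overlap the float.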
Thanks @Loirooriol, I like your definition of a formatting context. I guess I'm not fully on the right track yet. May I ask a few more questions to better understand it? I added quotes from the relevant parts of the spec.
With "additional boxes" I mean the boxes an element may generate in addition to its principal box.
for each element, CSS generates zero or more boxes as specified by that element’s display property. Typically, an element generates a single box, the principal box, which represents itself and contains its contents in the box tree. However, some display values (e.g. display: list-item) generate more than one box (e.g. a principal block box and a child marker box).
Are these additional boxes always nested in the principal box in the box tree? Or can they be a sibling to the principal box in the box tree?
Otherwise the principal box could be defined as the "outermost" box generated by the element in the box tree, since any additional boxes are nested within the principal box.
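To make the question concrete, a sketch of the kind of element I have in mind (the id and the styling are made up):

```html
<style>
  /* Hypothetical id. display: list-item makes the element generate a
     principal block box plus a marker box. */
  #item { display: list-item; margin-left: 2em; }
  /* ::marker represents the additionally generated marker box. */
  #item::marker { color: red; }
</style>
<div id="item">content of the principal box</div>
```

The question is whether the red marker is nested inside the principal box of `#item` in the box tree or sits next to it.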
A box either establishes a new independent formatting context or continues the formatting context of its containing block.
Certain properties can force a box to establish an independent formatting context in cases where it wouldn’t ordinarily. For example, making a box out-of-flow causes it to blockify as well as to establish an independent formatting context. [..] Turning a block into a scroll container will cause it to establish an independent formatting context
Assuming there are no out-of-flow boxes, i.e. no floating boxes and no positioned boxes, what is the difference between creating a new independent FC and "continuing" the parent FC? Is there an example where there is a difference?
If "continuing" a FC by a child box is possible, then it must be asked how a formatting context handles a nested set of boxes. The nested boxes must "stay inside" the parent boxes (assuming no overflow) instead of being laid out sequentially. But the current definitions of formatting contexts don't handle the case of nested boxes. For example:
In a block formatting context, boxes are laid out one after the other, vertically, beginning at the top of a containing block
In an inline formatting context, boxes are laid out horizontally, one after the other, beginning at the top of a containing block.
This problem was also what I tried to highlight in my previous answer.
Note: A block container box can both establish a block formatting context and an inline formatting context simultaneously.
an inline formatting context exists within and interacts with the block formatting context of the element that establishes it, and a ruby container overlays a ruby formatting context over the inline formatting context in which its ruby base container participates.
How can two sets of "rules that dictate how to lay out descendant boxes" co-exist without destructively interfering? Can't these "co-existing" formatting contexts simply be interpreted as a single (new) formatting context, which is the union of the rules that make up the individual FCs? Like supersets so to speak?
For example, a block formatting context could be viewed as a superset of an inline FC. It allows block-level boxes to flow in block direction but also inline-level boxes to flow in inline direction by distributing them in block-level wrapper line boxes.
I think this is currently the case, yes.
the principal box could be defined as the "outermost" box generated by the element in the box tree
Seems reasonable to me.
It means that the box doesn't establish an independent formatting context, so its children will participate in the parent formatting context.
Assuming there are no out-of-flow boxes, i.e. no floating boxes and no positioned boxes, what is the difference between creating a new independent FC and "continuing" the parent FC?
Well, for example margin collapse only happens in the same BFC. See the differences in
<div><p>The &lt;div&gt; continues the BFC</p></div>
<div><p>The &lt;div&gt; continues the BFC</p></div>
<div style="display: flow-root"><p>The &lt;div&gt; establishes an independent BFC</p></div>
<div style="display: flow-root"><p>The &lt;div&gt; establishes an independent BFC</p></div>
Well, that's because block and inline layout are two sides of the same coin (flow layout). So yes, you can consider flow layout to be a superset of block layout and inline layout.
Though I'm actually not pleased by this FC coexistence, I think that element-generated block containers should either continue their parent BFC or establish an independent BFC, but never an IFC. IFC should only be established by anonymous block containers IMO.
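A sketch of the anonymous case, as I understand CSS2 (the id is made up):

```html
<div id="block">
  <p>block-level child</p>
  <!-- This run of inline content sits between block-level siblings, so it
       is wrapped in an anonymous block box. -->
  loose text and <em>an inline box</em>
  <p>another block-level child</p>
</div>
```

Here `#block` only has block-level children, and the inline content is laid out in an IFC established by the anonymous block box, which is the only kind of box that I think should establish one.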
I think this is currently the case, yes.
I don't think that's true of the table wrapper box, nor the boxes that are created for blocks-inside-inlines, fwiw.
@Loirooriol Thanks again, these are great examples!
I also agree that a block container should strictly establish BFCs and not IFCs. The current definition is very confusing.
A block container either contains only inline-level boxes participating in an inline formatting context, or contains only block-level boxes participating in a block formatting context
(note how the outer display type was not specified, meaning it can be either inline-level or block-level). Instead of filling in the "?" in the below table, it is a general term for either entry, basically a "flow container". Not really useful.
| box name | outer display type | FC |
| - | - | - |
| inline box | inline-level | IFC |
| inline-block box | inline-level | BFC |
| block box | block-level | BFC |
| ? | block-level | IFC |
@emilio The first is still true, since the table wrapper box is the principal box and not the table grid box.
The element generates a principal table wrapper box that establishes a block formatting context, and which contains an additionally-generated table grid box that establishes a table formatting context.
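A sketch that makes the two boxes visible (the styles are only for illustration):

```html
<!-- margin is used on the table wrapper box, border on the table grid box. -->
<table style="margin: 2em; border: 3px solid blue">
  <!-- The caption is a child of the wrapper box, so it is drawn outside
       the blue border of the grid box. -->
  <caption>caption</caption>
  <tr><td>cell</td></tr>
</table>
```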
Regarding the second, I don't understand what you mean. Do you mean line boxes?
@emilio Why not? The principal box is the table wrapper box, which contains the table grid box.
For blocks-inside-inlines, see the resolution in https://github.com/w3c/csswg-drafts/issues/1477#issuecomment-380771705: the behavior is sane in the box tree and the block box stays inside the inline box, which is not split (so there is a single inline box, the principal one). All the splitting mess is confined into the fragment tree.
Ah, that's true, though it's a bit unintuitive IMO (I prefer to think of the wrapper box as an anonymous box, as only a handful of styles in the grid box apply to the table box).
Re #1477, sure, but that's not how any implementation works, as far as I can tell :)