Terminal: Typing inside of the default WSL terminal feels amazing, why is it better than every other app?

Created on 14 Dec 2018 · 8 Comments · Source: microsoft/terminal

Sorry, this isn't an issue, but instead, it's more of a suggestion / request to please not break whatever you did with the default WSL terminal (Ubuntu specifically) being so responsive when it comes to rendering characters on the screen after pressing a key.

Typing in the default WSL terminal feels like you're typing on air. There's a smoothness to it that's not present in any other Windows app, not even notepad.exe. It feels like it has 10ms of input lag instead of 75ms+ for all other Windows apps (and 200-250ms+ for most Electron-based apps).

What makes the WSL terminal feel better than notepad.exe and will this UI enhancement make its way to all Windows apps in the future?

Feel free to close this if you don't want to discuss it. I mainly opened it to bring an awareness to how good it is to hopefully prevent some type of regression from happening in the future.

Issue-Question Product-Conhost

nickjj

👍207 ❤28 👀10 🚀7 🎉1

Most helpful comment

I really do not mind when someone comes by and decides to tell us that we're doing a good job at something. We hear so many complaints every day that a post like this is a breath of fresh air. Thanks for your thanks!

Also, I'm happy to discuss this with you until you're utterly sick of reading it. Please ask any follow-ons you want. I thrive on blathering about my work. :P

If I had to take an educated guess as to what is making us faster than pretty much any other application on Windows at putting your text on the screen... I would say it is because that is literally our only job! Also probably because we are using darn near the oldest and lowest level APIs that Windows has to accomplish this work.

Pretty much everything else you've listed has some sort of layer or framework involved, or many, many layers and frameworks, when you start talking about Electron and Javascript. We don't.

We have one bare, super un-special window with no additional controls attached to it. We get our keys fed into us from just barely above the kernel given that we're processing them from window messages and not from some sort of eventing framework common to pretty much any other more complicated UI framework than ours (WPF, WinForms, UWP, Electron). And we dump our text straight onto the window surface using GDI's PolyTextOut with no frills.

Even notepad.exe has multiple controls on its window at the very least and is probably (I haven't looked) using some sort of library framework in the edit control to figure out its text layout (which probably is using another library framework for internationalization support...)

Of course this also means that we have trade offs. We don't support fully international text like pretty much every other application will. RTL? No go zone right now. Surrogate pairs and emoji? We're getting there but not there yet. Indic scripts? Nope.

Why are we like this? For one, conhost.exe is old as dirt. It has to use the bare metal bottom layer of everything because it was created before most of those other frameworks were created. And also it maintains as low/bottom level as possible because it is pretty much the first thing that one needs to bring up when bringing up a new operating system edition or device before you have all the nice things like frameworks or what those frameworks require to operate. Also it's written in C/C++ which is about as low and bare metal as we can get.

Will this UI enhancement come to other apps on Windows? Almost certainly not. They have too much going on which is both a good and a bad thing. I'm jealous of their ability to just call one method and lay out text in an uncomplicated manner in any language without manually calculating pixels or caring about what styles apply to their font. But my manual pixel calculations, dirty region math, scroll region madness, and more make it so we go faster than them. I'm also jealous that when someone says "hey can you add a status bar to the bottom of your window" that they can pretty much click and drag that into place with their UI Framework and it will just work, whereas for us, it's been a backlog item forever and gives me heartburn to think about implementing.
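The "dirty region math" mentioned above can be sketched roughly as follows. This is only an illustration of the general technique (track the smallest rectangle of changed character cells, then repaint just that rectangle), not conhost's actual code. The names `CellRect`, `DirtyRegion`, `PixelRect` and `ToPixels` are hypothetical, and a fixed-width font with a constant cell size is assumed.

```cpp
#include <algorithm>

// A rectangle in character-cell coordinates; right and bottom are exclusive.
// Hypothetical type for illustration only.
struct CellRect { int left; int top; int right; int bottom; };

// The same rectangle expressed in pixels.
struct PixelRect { int left; int top; int right; int bottom; };

// Tracks the smallest rectangle covering every cell changed since the last paint.
class DirtyRegion {
public:
    // Record that a single cell changed.
    void MarkCell(int col, int row) {
        const CellRect cell = {col, row, col + 1, row + 1};
        MarkRect(cell);
    }

    // Grow the dirty rectangle so that it also covers r.
    void MarkRect(const CellRect& r) {
        if (!hasDirty_) {
            dirty_ = r;
            hasDirty_ = true;
            return;
        }
        dirty_.left = std::min(dirty_.left, r.left);
        dirty_.top = std::min(dirty_.top, r.top);
        dirty_.right = std::max(dirty_.right, r.right);
        dirty_.bottom = std::max(dirty_.bottom, r.bottom);
    }

    // Copy the pending region into out and reset.
    // Returns false (leaving out untouched) when nothing needs repainting.
    bool Take(CellRect& out) {
        if (!hasDirty_) {
            return false;
        }
        out = dirty_;
        hasDirty_ = false;
        return true;
    }

private:
    CellRect dirty_ = {0, 0, 0, 0};
    bool hasDirty_ = false;
};

// Manual pixel calculation: with a fixed-width font every cell has the same
// size, so converting cells to pixels is plain multiplication.
inline PixelRect ToPixels(const CellRect& r, int cellWidth, int cellHeight) {
    const PixelRect px = {r.left * cellWidth, r.top * cellHeight,
                          r.right * cellWidth, r.bottom * cellHeight};
    return px;
}
```

A painter built this way would call `Take` once per frame and redraw only the returned rectangle. A single bounding rectangle is the simplest possible choice here; a real renderer may well track several regions instead.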

Will we try to keep it from regressing? Yes! Right now it's sort of a manual process. We identify that something is getting slow and then we go haul out WPR and start taking traces. We stare down the hot paths and try to reason out what is going on and then improve them. For instance, in the last cycle or two, we focused on heap allocations as a major area where we could improve our end-to-end performance, changing a ton of our code to use stack-constructed iterator-like facades over the underlying request buffer instead of translating and allocating it into a new heap space for each level of processing.
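To illustrate the idea of a stack-constructed, iterator-like facade over a request buffer, here is a minimal sketch. It is not the console's real code: `WidenCopy`, `WidenView` and `CountSpaces` are hypothetical names, and the "translation" is a plain byte-to-`char16_t` widening chosen only because it is easy to follow.

```cpp
#include <cstddef>
#include <string>

// Allocating approach: each processing layer translates the request into a
// freshly allocated heap buffer before looking at it.
inline std::u16string WidenCopy(const char* data, std::size_t size) {
    std::u16string out;
    out.reserve(size);
    for (std::size_t i = 0; i < size; ++i) {
        out.push_back(static_cast<char16_t>(static_cast<unsigned char>(data[i])));
    }
    return out;
}

// Facade approach: a small object that lives on the stack, points at the
// original buffer, and translates each element only when it is read.
class WidenView {
public:
    class Iterator {
    public:
        explicit Iterator(const char* p) : p_(p) {}
        char16_t operator*() const {
            return static_cast<char16_t>(static_cast<unsigned char>(*p_));
        }
        Iterator& operator++() { ++p_; return *this; }
        bool operator!=(const Iterator& other) const { return p_ != other.p_; }
    private:
        const char* p_;
    };

    WidenView(const char* data, std::size_t size) : data_(data), size_(size) {}
    Iterator begin() const { return Iterator(data_); }
    Iterator end() const { return Iterator(data_ + size_); }
    std::size_t size() const { return size_; }

private:
    const char* data_;  // not owned; the request buffer must outlive the view
    std::size_t size_;
};

// A consumer written against "anything iterable" works with either approach.
template <typename Range>
std::size_t CountSpaces(const Range& text) {
    std::size_t count = 0;
    for (char16_t ch : text) {
        if (ch == u' ') {
            ++count;
        }
    }
    return count;
}
```

Both `CountSpaces(WidenCopy(buf, len))` and `CountSpaces(WidenView(buf, len))` give the same answer, but only the first one touches the heap. The trade-off is that the view does not own its data, so the underlying buffer has to stay alive for as long as the view is in use.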

As an aside, @bitcrazed wants us to automate performance tests in some conhost specific way, but I haven't quite figured out a controlled environment to do this in yet. The Windows Engineering System runs performance tests each night that give us a coarse grained way of knowing if we messed something up for the whole operating system, and they technically offer a fine grained way for us to insert our own performance tests... but I just haven't got around to that yet. If you have an idea for a way for us to do this in an automated fashion, I'm all ears.

If there's anything else you'd like to know, let me know. I could go on all day. I deleted like 15 tangents from this reply before posting it....

miniksa on 14 Dec 2018

👍336 ❤143 🎉35 🚀27 😄14

All 8 comments

Speaking of low-level Windows APIs, I found that the _user mode_ components of both Console (conhost.exe, cmd.exe ...) and WSL (wsl.exe, LxssManager.dll ...) use the C++ _STL_ enormously, for example in string manipulation, memory allocation, virtual tables, etc. Would there be any performance improvement if only C were used?

Biswa96 on 14 Dec 2018

👎24 👀4 👍1

I wouldn't consider those as using C++ STL "enormously". They definitely use some of the collections and perhaps a bit of string manipulation and an algorithm here or there. I feel like there's a lot more to STL than those few bits.

But anyway... most of the things you describe were written straight up in C a while back. We've been selectively using C++ and STL in more and more of them as a conscious tradeoff. As long as we're well aware of what the templates are doing under the hood and are making careful use of the correct ones, we're generally only trading a very small amount of performance while gaining a significant amount of security and programming ease.

Security is actually the big reason why we use STL templates over trying to craft our own structures whenever possible. Using the STL templates for collections and strings generally grants us with bounds checking that would otherwise have to be done manually in an error prone fashion (or not done at all!)
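As a small illustration of the bounds-checking point, here is a sketch using `std::array::at`, which the C++ standard requires to throw `std::out_of_range` on a bad index. Note that plain `operator[]` on standard containers is not required to check anything, although some implementations add checks in debug builds. `kRowWidth` and `WriteCell` are hypothetical names, not anything from the console code base.

```cpp
#include <array>
#include <cstddef>
#include <stdexcept>

// Hypothetical width of one row of a screen buffer.
constexpr std::size_t kRowWidth = 80;

// With a raw C array, "row[col] = ch;" compiles for any col, and an
// out-of-range col silently corrupts whatever sits next to the array.
// With std::array::at the same mistake becomes a detectable error.
inline bool WriteCell(std::array<char, kRowWidth>& row, std::size_t col, char ch) {
    try {
        row.at(col) = ch;  // throws std::out_of_range when col >= kRowWidth
        return true;
    } catch (const std::out_of_range&) {
        return false;      // caller learns the write was rejected
    }
}
```

The cost is one comparison per access, which is the "very small amount of performance" kind of trade described above.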

Also I'm not strictly sure that having 17 different queuing and linked list implementations inside the console code when it was straight C was overall better for performance than using std::queue and std::list today. It probably consumed more on-disk space for each individual implementation which is a different type of performance issue (storage and the page-load I/O).

miniksa on 14 Dec 2018

👍40

Thanks a lot for taking the time out to write that reply, and no problem on the praise.

For the past couple of months I was looking for a good terminal setup, and I think ubuntu.exe with tmux is as close to perfection as you can get with the tools we have today. The ubuntu.exe terminal itself is blazing fast and tmux gives you all of the quality of life goodies (tabs, split panes, buffer searching, etc.).

Really looking forward to future releases that make color themes more compatible and properties-related enhancements like hotkeys for zooming +/- on the font size.

nickjj on 14 Dec 2018

@miniksa: Thanks for a very interesting post!

> we are using darn near the oldest and lowest level APIs that Windows has to accomplish this work.

I've been curious about this for a while. You appear to be referring to the various Win32 APIs (USER32 / GDI32). I've lately become unsure about whether they are still as low-level as one can get in recent versions of Windows (8.0 and later), or whether these APIs have been silently converted to sit on top of other stuff (such as Direct2D, DirectWrite, etc.). How do the older APIs relate to the newer ones? I'd love it if you could clarify that bit!

stakx on 16 Dec 2018

@stakx, I am referring to USER32 and GDI32.

I'll give you a cursory overview of what I know off the top of my head without spending hours confirming the details. As such, some of this is subject to handwaving and could be mildly incorrect but is probably in the right direction. Consider every statement to be my personal knowledge on how the world works and subject to opinion or error.

For the graphics part of the pipeline (GDI32), the user-mode portions of GDI are pretty far down. The app calls GDI32, some work is done in that DLL on the user-mode side, then a kernel call jumps over to the kernel and drawing occurs.

The portion that you're thinking of regarding "silently converted to sit on top of other stuff" is probably that once we hit the kernel calls, a bunch of the kernel GDI stuff tends to be re-platformed on top of the same stuff as DirectX when it is actually handled by the NVIDIA/AMD/Intel/etc. graphics driver and the GPU at the bottom of the stack. I think this happened with the graphics driver re-architecture that came as a part of WDDM for Windows Vista. There's a document out there somewhere about what calls are still really fast in GDI and which are slower as a result of the re-platforming. Last time I found that document and checked, we were using the fast ones.

On top of GDI, I believe there are things like Common Controls or comctl32.dll which provided folks reusable sets of buttons and elements to make their UIs before we had nicer declarative frameworks. We don't use those in the console really (except in the property sheet off the right click menu).

As for DirectWrite and D2D and D3D and DXGI themselves, they're a separate set of commands and paths that sit completely off to the side from GDI, both in user and kernel mode. They're not really related other than that there's some interoperability provisions between the two. Most of our other UI frameworks tend to be built on top of the DirectX stack though. XAML is for sure. I think WPF is. Not sure about WinForms. And I believe the composition stack and the window manager are using DirectX as well.

As for the input/interaction part of the pipeline (USER32), I tend to find most other newer things (at least for desktop PCs) are built on top of what is already there. USER32's major concept is windows and window handles and everything is sent to a window handle. As long as you're on a desktop machine (or a laptop or whatever... I mean a classic-style Windows-powered machine), there's a window handle involved and messages floating around and that means we're talking USER32.

The window message queue is just a straight up FIFO (more or less) of whatever input has occurred relevant to that window while it's in the foreground + whatever has been sent to the window by other components in the system.

The newer technologies and the frameworks like XAML and WPF and WinForms tend to receive the messages from the window message queue one way or another and process them and turn them into event callbacks to various objects that they've provisioned within their world.
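The difference between handling messages straight off the queue and having a framework turn them into event callbacks can be sketched in portable C++. Everything here is a simplified stand-in: `Message`, `MsgKind`, `BareWindowProc`, `EventRouter` and `Pump` are hypothetical and are not the real USER32 types or any real framework's API.

```cpp
#include <functional>
#include <queue>
#include <string>
#include <utility>
#include <vector>

// Hypothetical stand-ins for window messages.
enum class MsgKind { KeyDown, Paint };
struct Message { MsgKind kind; char key; };

// Direct style: one function looks at the message and acts on it at once.
inline void BareWindowProc(const Message& m, std::string& screen) {
    switch (m.kind) {
    case MsgKind::KeyDown:
        screen.push_back(m.key);  // put the character "on screen"
        break;
    case MsgKind::Paint:
        break;                    // nothing to do in this sketch
    }
}

// Framework style: messages are converted into events and routed to
// whichever handlers objects have registered.
class EventRouter {
public:
    using KeyHandler = std::function<void(char)>;

    void OnKey(KeyHandler handler) { keyHandlers_.push_back(std::move(handler)); }

    void Dispatch(const Message& m) {
        if (m.kind == MsgKind::KeyDown) {
            for (const KeyHandler& handler : keyHandlers_) {
                handler(m.key);
            }
        }
    }

private:
    std::vector<KeyHandler> keyHandlers_;
};

// Drain the queue in FIFO order, handing each message to sink.
template <typename Sink>
void Pump(std::queue<Message>& queue, Sink&& sink) {
    while (!queue.empty()) {
        const Message m = queue.front();
        queue.pop();
        sink(m);
    }
}
```

Pumping the same messages through `BareWindowProc` or through an `EventRouter` with one registered handler ends with the same text on screen. The router just adds a layer of indirection (handler lookup and `std::function` calls) in exchange for flexibility, which is the general shape of the trade-off described above.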

However, the newer technologies that also work on other non-desktop platforms like XAML tend to have the ability to process stuff off of a completely different non-USER32 stack as well. There's a separate parallel stack to USER32 with all of our new innovations and realizations on how input and interaction should occur that doesn't exactly deal with classic messaging queues and window handles the same way. This is the whole Core* family of things like CoreWindow and CoreMessaging. They also have a different concept of "what is a user" that isn't so centric around your butt in a rolling chair in front of a screen with a keyboard and mouse on the desk.

Now, if you're on XAML or one of the other Frameworks... all this intricacy is handled for you. XAML figures out how to draw on DirectX for you and negotiates with the compositor and window manager for cool effects on your behalf. It figures out whether to get your input events from USER32 or Core* or whatever transparently depending on your platform and the input stacks can handle pen, touch, keyboard, mouse, and so on in a unified manner. It has provisions inside it embedded to do all the sorts of globalization, accessibility, input interaction, etc. stuff that make your life easy. But you could choose to go directly to the low-level and handle it yourself or skip handling what you don't care about.

The trick is that GDI32 and USER32 were designed for a limited world with a limited set of commands. Desktop PCs were the only thing that existed, single user at the keyboard and mouse, simple graphics output to a VGA monitor. So using them directly at the "low level" like conhost does is pretty easy. The new platforms could be used at the "low level" but they're orders of magnitude more complicated because they now account for everything that has happened with personal computing in 20+ years like different form factors, multiple active users, multiple graphics adapters, and on and on and on and on. So you tend to use a framework when using the new stuff so your head doesn't explode. They handle it for you, but they handle more than they ever did before so they're slower to some degree.

So are GDI32 and USER32 "lower" than the new stuff? Sort of.
Can you get that low with the newer stuff? Mostly yes, but you probably shouldn't and don't want to.
Does new live on top of old or is old replatformed on the new? Sometimes and/or partially.
Basically... it's like the answer to anything software... "it's an unmitigated disaster and if we all stepped back a moment, we should be astounded that it works at all." :P

Anyway, that's enough ramble for one morning. Hopefully that somewhat answered your questions and gave you a bit more insight.

miniksa on 17 Dec 2018

👍58 ❤26 🚀4 🎉3 😄1