Amusingly, I can see a future where LLMs become the default interface most people use, making GUIs as we know them obsolete. Each app could provide MCP services, and then your UI would just be a text prompt and a canvas where whatever information you’re looking for is visualized in whatever way you need. The model would take care of coordinating the functionality from different available services and rendering the output on the fly.
That seems extremely frustrating to use. I don’t want or need that compared to what’s currently available.
And of course, there will always be people who want direct access to the underlying command line. Which is unsurprising now and will still be unsurprising in another 20 years.
I don’t think the underlying command line will go away, but if things moved in this direction, that would be a benefit for people who use the command line directly as well. The big disadvantage of GUI apps as they currently work is that the UI is tightly coupled to the business logic. This makes it impossible to write scripts that combine functionality from different apps the way you can with shell utils. In my opinion, this was the wrong direction all along. It would be much better if GUI apps were built with a client/server architecture. The service could then be used headless, exposed as an MCP service for LLMs, or driven directly by hand, roughly along the lines of the sketch below.
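Not anyone’s actual design, just a minimal sketch of the decoupling being described, assuming a hypothetical note-taking app: the business logic is a plain class, the HTTP layer is a thin wrapper around it, and a desktop GUI, a shell script with curl, or an MCP server could all drive the same endpoints.

```python
# Sketch only: "NoteStore" and the /notes endpoint are made up for illustration.
import json
from http.server import BaseHTTPRequestHandler, HTTPServer


class NoteStore:
    """Core business logic: no UI code in here at all."""

    def __init__(self):
        self._notes = []

    def add(self, text: str) -> dict:
        note = {"id": len(self._notes) + 1, "text": text}
        self._notes.append(note)
        return note

    def list(self) -> list:
        return self._notes


STORE = NoteStore()


class NoteHandler(BaseHTTPRequestHandler):
    """Thin transport layer. A GUI, a shell script, or an MCP server
    wrapping these endpoints would all hit the same underlying logic."""

    def do_GET(self):
        if self.path == "/notes":
            self._reply(200, STORE.list())
        else:
            self._reply(404, {"error": "not found"})

    def do_POST(self):
        if self.path == "/notes":
            length = int(self.headers.get("Content-Length", 0))
            body = json.loads(self.rfile.read(length) or b"{}")
            self._reply(201, STORE.add(body.get("text", "")))
        else:
            self._reply(404, {"error": "not found"})

    def _reply(self, status, payload):
        data = json.dumps(payload).encode()
        self.send_response(status)
        self.send_header("Content-Type", "application/json")
        self.end_headers()
        self.wfile.write(data)


if __name__ == "__main__":
    # Any front end, graphical or otherwise, can now drive the service, e.g.
    #   curl -X POST localhost:8000/notes -d '{"text": "hello"}'
    HTTPServer(("localhost", 8000), NoteHandler).serve_forever()
```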
Tbf, that is one thing about GUIs that certainly frustrates a lot of people. And I do quite like the concept of programs and utilities that can be used either from a separate graphical interface that you only have to install if you want it, or via a command line; those are neat and an excellent example of a tool that can be used by the completely non-technical while still providing useful functionality to power users.
mm yes, i’d love to have a UI for the computer where the same input will produce different results each time (and makes half of it up on the spot) and you can never ever trust anything it outputs but have to check literally every single pixel carefully.
as opposed to entering a cli command you’re familiar with and checking just the little bit of the output that is relevant to you, because you are familiar with the output format and it’s stable.
i like having extra cognitive load, it helps.
Yeep. I don’t need a hallucination machine between me and my computer. I hate having to use the CLI, but that’s still better than a fucking chatbot getting in my way.
You wouldn’t be the target audience for this. The vast majority of people using computers aren’t technical, and they would very much prefer just being able to use natural language. The whole hallucination thing is also much less of a problem when the model is simply used as an interface to access services, pull data from them, and present it. In fact, you can already try this sort of functionality with DeepSeek right now: it has a canvas, and you can ask it to pull some data from the internet and render it. I’m always amazed at how technical people are utterly unable to step out of their own shoes and imagine what a non-technical person might prefer to use.
I dunno. I still think the types of graphical tools we have today are better than a chatbot for an interface. I often struggle with explaining what I want in a way that such a tool would actually return anything useful. I might try it once or twice, but I’d probably go back to a standard graphical interface pretty quick. I’m not a “terminal junkie” or particularly technical, but I absolutely hate the idea of a computer you have to talk to like it’s a person, I suck at dealing with people.
Sure, different people like different things. My original point was that just being able to use natural language would be a benefit for non-technical users. Most people struggle with complex UIs or with tasks where they have to engage multiple apps. Being able to just explain what you want the way you would to another person would lower the barrier significantly. For technical users, we already have tools we can leverage, but if MCP services became a common way to build apps, then we’d get benefits from that as well.
People who keep parroting this clearly never bothered actually seeing how these tools work in practice. Go play with DeepSeek canvas and look at how it can render the data you give it. Meanwhile, what you as a technical user prefer is vastly different from what an average person wants.
yeah sure i’ll probably check it out one of these days, but i’ve never yet seen this technology do the same thing twice when you give it the same input twice…
I haven’t seen this be a problem when you’re using it to pull data from existing sources with tools like MCPs. Specifically, the content stays stable even if there are minor presentation variations, and that’s sufficient to be useful in most scenarios. For example, if you get it to pull some JSON from a service and render it as a table or graph, the content of the presentation will not change (see the sketch below).
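A rough illustration of that claim, with the URL and field names made up for the example: once the model’s only job is to fetch structured data and lay it out, the values in the table are pinned down by the source, however the presentation ends up styled.

```python
# Hypothetical endpoint and fields, purely for illustration.
import json
import urllib.request

URL = "https://example.com/api/population.json"

with urllib.request.urlopen(URL) as resp:
    rows = json.load(resp)  # e.g. [{"city": "Oslo", "population": 709037}, ...]

# However the table is styled or regenerated, these values come straight
# from the JSON, so the content itself doesn't drift between runs.
print(f"{'city':<15}{'population':>12}")
for row in rows:
    print(f"{row['city']:<15}{row['population']:>12,}")
```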
When I gave Qwen 2.5 VL Instruct the exact same input twice, it produced the exact same output.