dang 4 days ago

Related:

VPRI - https://news.ycombinator.com/item?id=27722969 - July 2021 (1 comment)

Viewpoints Research Institute concluded its operations at the beginning of 2018 - https://news.ycombinator.com/item?id=26926660 - April 2021 (36 comments)

Final “STEPS Toward the Reinvention of Programming” Paper [pdf] - https://news.ycombinator.com/item?id=11686325 - May 2016 (64 comments)

A computer system in less than 20k LOC progress report - https://news.ycombinator.com/item?id=1942204 - Nov 2010 (3 comments)

A compiler made only from PEG-based transformations for all stages - https://news.ycombinator.com/item?id=1819779 - Oct 2010 (2 comments)

Steps Toward The Reinvention of Programming - https://news.ycombinator.com/item?id=141492 - March 2008 (12 comments)

I feel certain that there were other HN threads related to this project if anyone wants to dig around for some!

floxy 4 days ago

Any updates on this program in the past 13 years?

  • morphle 4 days ago

    Lots of later follow-up research has been published.

    I am proposing that the European Community fund a secure parallel operating system, GUI, applications and hardware, built from scratch in 20 KLOC, so that Europe gains computational independence from the US. I consider it the production version of the STEPS research.

    We are in the sign-up stage for the researchers, programmers and chip designers, and have regular meetings and presentations [1].

    Half a trillion euros is the wider funding pool: several hundred million for European chips and operating systems, billions for European chip fabs, and dozens of billions for buying the secure EU software and chips for government, schools and the military.

    An unsolved problem is how to program a web browser in less than 20 KLOC.

    I think that the STEPS research was a resounding success, as demonstrated in Alan Kay's talks [2] and confirmed by studying the source code. As mentioned in my earlier HN post, I have a working version of Frank and most other parts of the STEPS research.

    [1] https://www.youtube.com/watch?v=vbqKClBwFwI

    [2] https://www.youtube.com/watch?v=ubaX1Smg6pY

    • andrewflnr 4 days ago

      > An unsolved problem is how to program a web browser in less than 20 KLOC.

      Can you even specify a modern web browser in under 20k lines of English? Between the backward compatibility and huge multimedia APIs, including all the references, I'd be surprised.

      • x-complexity 4 days ago

        Given the absolute behemoth of scope laid out in the W3C specs, I don't think that's even possible.

        https://codetabs.com/count-loc/count-loc-online.html

        Using LadybirdBrowser/ladybird & ignoring the following folders:

        .devcontainer,.github,Documentation,Tests,Toolchain,Libraries

        ...Yields about 36k lines of C++. With the libraries, the LOC count balloons to 310k.
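        For anyone who wants to sanity-check the number offline, here is a rough Python sketch of the same count (the folder excludes are taken from the list above; counting only non-blank lines in C++-ish files is my assumption, and "ladybird" is a local clone of the repo):

            import os

            SKIP = {".devcontainer", ".github", "Documentation",
                    "Tests", "Toolchain", "Libraries"}
            EXTS = {".cpp", ".h", ".cc", ".hpp"}

            total = 0
            for root, dirs, files in os.walk("ladybird"):
                dirs[:] = [d for d in dirs if d not in SKIP]  # prune excluded folders
                for name in files:
                    if os.path.splitext(name)[1] in EXTS:
                        path = os.path.join(root, name)
                        with open(path, errors="ignore") as fh:
                            total += sum(1 for line in fh if line.strip())  # non-blank lines
            print(total)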

        If a still-in-alpha browser already has 300k lines of code to deal with, there's very little chance that a spec-compliant browser could fit within 30k lines.

        • andrewflnr 4 days ago

          I mean, the point of STEPS is in fact to do things in orders of magnitude less code than languages like C++. 310k is almost encouraging. :D

          • mlajtos 4 days ago

            I have never understood why nobody wrote a web browser on top of Smalltalk.

            • xkriva11 4 days ago

              Never? There is a web browser named Scamper.

              • mlajtos 3 days ago

                I know about Scamper, but it is dead.

                I was thinking about a Smalltalk web browser 4 years ago; more in-depth here: https://www.reddit.com/r/smalltalk/comments/jnigzb/native_we...

                Since then, a lot has changed. One dedicated Smalltalker with an LLM-infused Squeak might do wonders.

                • xkriva11 3 days ago

                  Coincidentally, some weeks ago I made it somewhat loadable in Pharo 13. But its practical value is, of course, questionable.

                  • mlajtos a day ago

                    What was your motivation to do this?

                    • xkriva11 a day ago

                      I made the conversion of the Squeak version to Pharo many years ago, and I just tried to make it work in the latest version (which was not straightforward because Pharo deprecated and removed some Morphic parts it used). So, mostly curiosity about whether it could still work and, if so, how well or poorly.

    • renox 3 days ago

      A resounding failure, you mean: they just demoed their software and didn't provide it in a way that would let people build on their research.

      And I haven't seen much follow-up research; do you have links?

      You're the third person (at least) to claim to have Frank working, but as with the others, there's nothing concrete.

      I wonder why. Maybe it's a copyright issue.

    • sph 3 days ago

      I am definitely interested, as someone who has been doing independent research on the work of STEPS, and particularly Piumarta and Warth, for the past few years — I'm not sure how to get in contact with this initiative. Any pointers?

      Honestly, I think the focus should move beyond Smalltalk; it showed what computers could be like in the 80s, but in the age of multi-core and pervasive networking, some of its ideas do not map well.

      My research these days is on L4-type microkernels, capabilities (which are an improvement over "basic" object-orientation), and RISC-V processors with CHERI technology. Incidentally, I just learned there is a European company working on this integration (Codasip), which would finally allow one to write distributed and secure systems starting from simple primitives such as message passing.

      If you know where to contact people working on these kinds of problems, EU-based, I am most interested. Email in profile.

    • beagle3 3 days ago

      Free (libre) software is already independent of the US by virtue of being open source and free. In what way would your solution offer more or better independence?

      I am all for a production 20K trusted free+open computing base, but … I don’t understand the logic.

      • crabbone 3 days ago

        It's humanly impossible to know what a program does once it grows beyond the size anyone can read in a reasonable amount of time.

        For comparison, consider this: I'm in my late 40s and I've never read In Search of Lost Time. My memory isn't what it used to be in my 20s... All eight volumes are about 5K pages, so about 150K lines. I can read about 100 pages per day, so it will take me about two months to read the whole book (likely a lot longer, since I won't be reading every day, won't always manage 100 pages, etc.). By the time I'm done, I will have already lost some memories of what happened two months earlier, and the beginning of the novel will have to be reinterpreted in the light of what came next.

        Reading programs is substantially harder than reading prose. Of course, people are different, and there is no hard limit on how much of program code one can internalize... but there's definitely a number of lines that makes most programmers unable to process the entire program. If we want programs to be practically understandable, we need to keep them shorter than that number.

        • beagle3 2 days ago

          You have just given the rationale for STEPS, which I am aware of and agree with.

          But the claim was that the EU should embark on and fund this to “gain independence from the US”, even though free software already gives you that independence.

          So, my question is: in what way would this project make the EU less dependent?

          North Korea reportedly has a Linux distribution, for example.

          • crabbone 2 days ago

            > even though free software already gives you that independence.

            No, not in the way I'd want (and probably not in the way the parent wants), for all the same reasons. If you are given something you cannot understand, you depend on the provider for support of the thing you cannot understand. Even if your PC were shipped with the blueprints for the CPU, you'd still depend on the CPU manufacturer to make your PCs. The fact that you can sort of figure out how the manufacturer made one doesn't make you the real owner of the PC, because the complexity of the manufacturing process makes it prohibitively expensive for you to become the PC's true owner.

            But let's move this back into the software world, where the problem is just as real (if not more so). Realistically, there are only two web browsers, and the second one makes every effort to alienate its users and die forgotten and irrelevant. Chrome (or Chromium and co.) is "free", but it is so complex that if you wanted a substantial change to its behavior, you alone wouldn't really be able to effect that change. (Hey, remember user scripts? Like in Opera, before it folded and became a Chromium clone? They were super useful, but adding that functionality back would be impossible nowadays without a major team effort.)

            So... Chromium and co. aren't really free. They are sort-of free.

            There are, unfortunately, many novel and insidious ways in which software freedom is attacked; subversion attempts come in a relentless tide. Complexity is one of the enemies of software freedom.

            • beagle3 2 days ago

              There are a lot of people in Europe working on KDE, including the really open web browser. They are the provider.

              The problem of a web implementation not being small is inherent in the size of the spec. You can definitely make a browser with many of the same functional capabilities in 20K lines, but it won't be able to show the existing web or be a replacement for Chrome.

              Many companies have a customized Linux kernel, which means you aren't actually dependent on the provider.

              In my opinion, the GGP’s claim the EU should fund their STEPS-like project because “it will help avoid American dependence” is … not in line with reality, just a straw man argument to grab available funds.

              Other than that, I agree it’s desirable for everyone to have such a thing. But not in any way because of American hegemony over Chrome.

    • vendiddy 2 days ago

      I feel that it's worth mentioning that Kay and others believe the web browser has a fundamental flaw: you send data formats that the browser interprets, rather than self-contained "objects" that know how to execute themselves.

      This is why we've been stuck with tech like CSS, JavaScript, and HTML, and why it's so hard to break out.

      Their version of a browser would likely be an address bar with the ability to safely run arbitrary programs (objects) in the space below.

      HTML, CSS, and JS would be special cases of this.

    • bobajeff 3 days ago

      > An unsolved problem is how to program a web browser in less than 20 KLOC.

      How about instead of a full web runtime, you change the problem to implementing (or inventing) server and client protocols for common web services? (Vlogging, Q&A forums, microblogging, social bookmarking, wikis, etc.)

      • beagle3 2 days ago

        The reason the web won is that it does NOT need specific clients for every single thing.

        Essentially every kind of service (e.g. email, blogging, Q&A, live news) is available without JavaScript, that is, through a pure HTML-over-HTTP interface. The problem with a standard-protocol-per-service is that new uses arrive in a distributed, unplanned manner.

        Looking at instant messaging history is instructive: there were 3 protocols in major use (AIM, MSN, ICQ) and about 20 others in common use. The “standard” committee was sabotaged by the major players for years and eventually disbanded, culminating in the only open option in some use (not major use, just some use) - XMPP - winning by default, except the providers explicitly chose NOT to interop (Facebook, WhatsApp when it was independent, Google Chat).

        • bobajeff 2 days ago

          People still use clients for certain services; otherwise many sites wouldn't still be making them.

          Of course you're right that this would all be hardcoded and would not allow new types of sites to work right away.

          I don't know that you'd even need a protocol or service for each category of site. It would probably make more sense to use the same architecture for all types of services, with something like a manifest to determine the mode. I think the challenge would be making the APIs public in a way that is practical for the servers implementing them.

          • beagle3 2 days ago

            There is one such client architecture. It’s called the web. You are describing the present state.

            • bobajeff a day ago

              I think you misunderstood me. I'm not describing anything like an application runtime such as the web. By architecture I meant something more like server APIs that are flexible enough to be used as the backend for different kinds of sites.

              For example: Instead of an API for microblogs and another for blogs and another for news sites it could just be one API with flags or something that determines which other calls are used and how.
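              (A toy of what I mean, in Python, with every name here hypothetical: a manifest that flags which capabilities of one shared API a given site serves, so a single client can adapt to it.)

                  # Hypothetical shared-service manifest; all names invented for illustration.
                  MANIFEST = {
                      "site": "example.org",
                      "api": "common-social/v1",
                      "capabilities": {
                          "posts":     {"enabled": True, "max_length": 500},  # microblog mode
                          "threads":   {"enabled": True},                     # forum mode
                          "live_chat": {"enabled": False},
                      },
                  }

                  def supported_calls(manifest):
                      # A client enables UI for exactly the capabilities the site flags on.
                      caps = manifest["capabilities"]
                      return [name for name, cfg in caps.items() if cfg.get("enabled")]

                  print(supported_calls(MANIFEST))  # ['posts', 'threads']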

              • beagle3 a day ago

                So let’s say we have Blogger with a Blogger API, and Twitter with a Twitter API, and by some miracle they agree on a merged-with-flags API.

                Along comes Tumblr, and … the APIs don’t fit, in the sense that they significantly limit the planned functionality.

                Now what? A new API! Or a committee to update the existing API!

                Central planning doesn’t work for these things. And when you do finally have agreement on something (e.g. email), you either stagnate because no one can improve things, or get incompatible implementations because a significant player decides to break the standard for some reason (e.g. recall and TNEF in Outlook, which only work in Outlook).

                The internet started the way you describe (with finger .plan for “social” status, Unix talk for one-to-one, IRC for chat, email, NNTP, Gopher, FTP, X Windows, RDP, etc.). All of these are 98% web-based these days, because the protocol+client+server model cannot adapt quickly enough. The modern web, with presentation layer + runtime code delivery + transport protocol, does allow lightning-fast evolution.

                • bobajeff a day ago

                  The point of the flags, manifests or whatever is so functionality can be set by the site. Like one site wants to support live chat and another wants to support bulletin-board-style posting.

                  The web is the best application runtime for making clients. However, I don't think its existence invalidates the creation of these kinds of protocols and server APIs. In fact, some web standards, such as RSS feeds, could be described as such.

      • mikedelfino 3 days ago

        This is something I often think about — if I understood you correctly. It sounds like an evolution of Gopher, with predefined structures for navigation, documents, and media. When we browse, we care more about the content than the presentation. There’s no real need for different layouts for blogs, documentation, news, forums, CRUD apps, streaming, email, shops, banking, and so on. If the scope were tightly restricted, implementing clients and servers would be much simpler. But it's just something I wonder about...

        • bobajeff 3 days ago

          Yeah, that's right. Though it need not be just one protocol. Many sites already have clients. It's just that the APIs are typically controlled by the site, are not client-neutral, and require credentials, as opposed to something like an RSS feed.

    • e12e 4 days ago

      > ... studying the source code. As mentioned before in my earlier HN post, I have a working version of Frank and most other parts of the STEPS research.

      Are the sources published?

      • morphle 4 days ago

        Yes. Some need to be ported from 32 bit to 64 bit.

        I have most of it in working condition or recompiled.

        • eterps 4 days ago

          If you've managed to get most of it working or recompiled, please consider writing a detailed blog post documenting your process.

          This would be an invaluable resource for the people who are interested in the results of the STEPS project, showing your methodology and providing step-by-step instructions for others to follow.

          I don't think you realize how many people have attempted this before and failed.

        • EgoIncarnate 4 days ago

          > Are the sources published?

          > Yes

          Where?

    • andrekandre 4 days ago

        > An unsolved problem is how to program a web browser in less than 20 KLOC.
      
      that would be amazing if possible, but i wonder: since "the web" is so full of workarounds and hacks, would it really be usable in most scenarios if done so succinctly...

      • mlajtos 4 days ago

        I propose a different lens to look at this problem.

        A neural net can be defined in less than 100 LoC. The knowledge is in the weights. What if we went from the source code of the web (HTML, CSS, JS, WASM) directly to a generated interactive simulation of said web? https://gamengen.github.io
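        (To make the first claim concrete, a minimal sketch: a complete two-layer net in plain numpy, trained on XOR, where everything it "knows" ends up in four small arrays. Nothing here is specific to the GameNGen work.)

            import numpy as np

            rng = np.random.default_rng(0)

            # All of the "knowledge" ends up in these four arrays.
            W1, b1 = rng.normal(0, 0.5, (2, 8)), np.zeros(8)
            W2, b2 = rng.normal(0, 0.5, (8, 1)), np.zeros(1)

            X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]], float)  # XOR inputs
            y = np.array([[0], [1], [1], [0]], float)              # XOR targets

            for _ in range(5000):                      # plain full-batch gradient descent
                h = np.tanh(X @ W1 + b1)               # hidden layer
                p = 1 / (1 + np.exp(-(h @ W2 + b2)))   # sigmoid output
                dp = p - y                             # grad of cross-entropy wrt logits
                dh = (dp @ W2.T) * (1 - h ** 2)        # backprop through tanh
                W2 -= 0.5 * (h.T @ dp); b2 -= 0.5 * dp.sum(0)
                W1 -= 0.5 * (X.T @ dh); b1 -= 0.5 * dh.sum(0)

            print(p.round(2))  # converges toward [[0], [1], [1], [0]]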

        What if this blob of weights could interpret way more stuff, not just the web?

        • ptx 3 days ago

          Yes, what if instead of the computer being an Internet Communications Device (as Steve Jobs called the iPhone), it would just pretend to allow us to communicate with other humans while actually trapping us in a false reality, as if we were all in the Truman Show?

          It might work, as indicated by the results in your link ("Human raters are only slightly better than random chance at distinguishing short clips of the game from clips of the simulation."), but the result would be a horrific dystopian nightmare, so why would we do this to ourselves?

          Anyway, there is one aspect where the STEPS work is similar to this idea, in that it tries to build a more concise model of the system. But it does this using domain-specific languages rather than lossy blobs of model weights, so the result is (ideally) the complete opposite of what you proposed: A less blobby, more transparent and more comprehensible expression of what the system does.

          • mlajtos 3 days ago

            Extremely valid counterpoints, thank you.

            We already interact with false reality through our "old school" computers – the internet is full of bots arguing with each other and with us. But my proposition doesn't have to distort the interpreted content.

            Neural nets (RNNs) are Turing-complete, so they can simulate a web browser 1:1. In theory, of course. Let's say we find a way to train a neural net to identically simulate a web browser. The weights of this blob might at first seem like opaque nonsense, but in reality they would/could contain a more efficient implementation than whatever we have come up with.

            Alan Kay believed computer science should take its cues from biology. Rather than constructing software like static buildings, we ought to cultivate it like living, evolving organisms.

    • linguae 4 days ago

      I’m very fascinated by this, and I hope that your proposal gets approved! I’m a community college instructor in Silicon Valley, and my plan this summer (which is when I have nearly three months off from teaching) is to work on a side project involving domain-specific languages for systems software. I’ve been inspired by the STEPS project, and I dream of systems being built with higher levels of abstraction, with very smart compilers optimizing them.

      • morphle 4 days ago

        Why not collaborate? It will help you avoid reinventing some wheels.

        For example, make a 3D version of the 2.5D Nile/Gezira graphics. You could do it in less than 500 lines of code and within 3 months.

        Other system software could be a new filesystem in 600 lines of code or a TCP/IP stack in 120 LOC.

        I also think a SPICE or physics simulator could be around 600 lines of code.

        I'll do the parallelizing, optimizing, adaptive compilers and autotuners (in OMeta 2).

        I intend to target a cluster of M3/M4 Macs with 32-core CPUs, 80-core GPUs and 32-core Neural Engines, with an estimated 80 trillion operations per second and 800 Gbps of memory bandwidth each. A smaller $599 base-model M4 Mac mini would do between a fifth and a third of that performance.

        Together we could beat NVIDIA's complexity and performance per dollar per watt in a few thousand lines of code.

    • kragen 4 days ago

      That's pretty exciting!

      • morphle 4 days ago

        It is exciting. The life's work of a dozen people.

        Imagine proving the entire IT business field, Silicon Valley and computer science wrong: you can write a complete operating system and all the functionality of the major apps (word processing, graphics, spreadsheets, social media, WYSIWYG, browsers) and the hardware it runs on in less than 20000 lines of (high level language) code. They achieved it a few times before in 10000 lines (Smalltalk-80 and earlier versions), a little over 20000 (Frank) and 300000 lines (Squeak/Etoys/Croquet), with a few programmers in a few years.

        Not like Unix/Linux/Android/MacOS/iOS or Windows in hundreds of millions of lines of code but in orders of magnitude less.

        • kragen 4 days ago

          > They achieved it a few times before in 10000 lines, 20000 and 300000 lines and a few programmers in a few years.

          Did they?

          • kragen 3 days ago

            Oh, I see now you've edited that to say:

            > They achieved it a few times before in 10000 lines (Smalltalk-80 and earlier versions), a little over 20000 (Frank) and 300000 lines (Squeak/Etoys/Croquet)

            Smalltalk-80 is significantly more than 10kloc. 6–10kloc gets you a Blue Book virtual machine without the compiler, editor, GUI, class library, etc. The full system is about 80kloc (Cuis is a currently runnable version of it, plus color: https://cuis.st/). Nobody's seen Frank (though you say you have a mostly working copy, I note that you haven't included a link). The 300kloc estimate for Squeak with all the trappings is at least in the ballpark but probably also a bit low.

            None of the three contained spreadsheets, "social media" similar to Mastodon or MySpace, or a web browser, though they did contain a different piece of software they called a "browser" (the currently popular term for such software is "IDE" or "editor".)

            You could implement those features inside that complexity budget, as long as you weren't trying to implement all of JS or HTML5, but they didn't. In the case of Smalltalk-80, those things hadn't been invented yet—though hypertext had, and Smalltalk-80 didn't even include a hypertext browser, though it has some similar features in its IDE. In the other cases it's because they didn't think it was a good idea for their purposes. The other two systems do support hypertext, at least.

            These systems are indeed very inspirational, and they pack a lot of functionality into a small amount of code, but exaggerating their already remarkable achievements does nobody any favors.

            • abecedarius 3 days ago

              Smalltalk-76 was around 10k lines, though probably you need to leave out the microcode/VM to get that number, I forget. (I have the source I'm thinking of on another computer powered down at the moment.) -80 was definitely bigger but -76 was a lot more like it than -72 was.

              • kragen 3 days ago

                Yeah, that seems about right. The Smalltalk-76 VM was pretty small, though. A lot smaller than the -80 VM. I think it's fair to say that Smalltalk-76 had WYSIWYG word processing and graphics, including things like paint programs. Like Smalltalk-80, I think, it's missing spreadsheets, social media, and hypertext browsers.

          • ruyvalle 4 days ago

            you can see for yourself, e.g. by looking at the Smalltalk emulators that run in the browser, reading Smalltalk books, etc.

            I think it's the "blue book" that was used by the Smalltalk group to revive Smalltalk-80 in the form of Squeak. it's well-documented for instance in the "back to the future" paper. I haven't had the fortune of studying Squeak or other Smalltalks in depth but it seems fairly clear to me that there are very powerful ideas being expressed very concisely in these systems. likewise with VPRI/STEPS.

            so although it might be somewhat comparing apples to oranges, I do think when, e.g., Alan Kay mentions in a talk that his group built a full personal computing system (operating system, "apps", etc) in ~20kLOC (iirc, but it's the same order of magnitude anyway), that it is important to take this seriously and consider the implications.

            similar when one considers Sutherland's Sketchpad, Engelbart's mother of all demos, Hypercard, etc. and contrasts with (pardon my French) the absolute trash that is most of what we use today (web browsers - not to knock the people who work on them, some of whom are clearly extremely capable and intelligent - generally no WYSIWYG, text and parsing all over the place, etc etc)

            like, I just saw a serious rendering glitch just now while typing this, where some text that came first was being displayed after text that came later, which made me go back and erase text just to realize the text was fine, type it again, and see the same glitch again. that to me seems completely insane. how is there such a rendering error in a textbox in 2025 on an extremely simple website?

            and this all points to a great deal of things that Alan Kay points out. some of his quips: "point of view is worth 80 IQ points", "stop reinventing the flat tire", and "most ideas are mediocre down to bad".

        • dajtxx 4 days ago

          I think one difference between these big pieces of software and the older systems is that the old systems ran on bespoke hardware and only one platform, whereas UNIX & Windows must support a lot of different hardware.

          That's not to say they don't seem far too large for what they do, but it is a factor. How much code is in ROM vs. loaded at runtime?

          It also depends on what you include in those LoC. Solitaire, paint, etc? I18N files?

          • ruyvalle 3 days ago

            not that I've done the LOC calculations, but I would guess Alan Kay, etc. include, e.g., Etoys, the equivalent of paint, and some games in what they consider to be their personal computing system, and therefore in the ~20 kLOC.

            and hardware is one of the crucial points Alan makes in his talks. the way he describes it, hardware is a part of your system and is something you should be designing, not buying from vendors. the situation would be improved if vendors made their chips configurable at runtime with microcode. it doesn't seem like a coincidence to me that a lot of big tech companies are now making their own chips (Apple, Google, Amazon, Microsoft are all doing this now). part of it is the AI hype (a mistake in my opinion, but I might be completely wrong there, time will tell). but maybe they are also discovering that, while you can optimize your software for your hardware, you can also optimize your hardware for the type of software you are trying to write.

            another point is that any general-purpose computer can be used to simulate any other computer, i.e. a virtual machine. meaning if software is bundled with its own VM and your OS doesn't get in the way, all you need for your software to run on a given platform is an implementation of the VM for that platform. which I think raises many questions, such as "how small can you make your OS" and "is it possible to generate and optimize VM implementations for given hardware".

            also something that came to mind is a general point on architecture, again from Alan Kay's ideas. he argues that biological systems (and the Internet, specifically TCP/IP, which he argues takes inspiration from biology) have the only architecture we know of that scales by many orders of magnitude. other architectures stop working when you try to make them significantly bigger or significantly smaller. which makes me wonder about much of hardware architecture being essentially unchanged for decades (with a growing number of exceptions), and likewise with software architecture (again with exceptions, but it seems to me that modern-day Linux, for instance, is not all that different in its core ideas from decades-old Unix systems).

            • crq-yml 3 days ago

              In this respect, Kay's methods seem to be merging with those of Chuck Moore: the difference lies in that Moore doesn't seem to perceive software as a "different thing" from hardware - the Forth systems he makes always center on extension from the hardware directly into the application, with no generalization to an operating system in between.

              • kragen 3 days ago

                I think Kay's work is in harmony with that approach, too.

            • kragen 3 days ago

              Etoys doesn't, as far as I know, run on any platforms that are anywhere close to 20kloc. It certainly could, but that's not the same thing.

          • kragen 3 days ago

            I don't think supporting a lot of different hardware is a big factor, but if you think it is, building standardized hardware to your liking is a pretty affordable thing to do today. You can use a highly standardized Raspberry Pi, you can implement custom digital logic on an FPGA, or you can wire together some STM32 or ESP32 microcontrollers programmed to do what you want. Bitbanging a VGA signal is well within their capacity, you can easily get US$3 microcontrollers with more CPU power than a SPARC 5, and old DRAM is basically free.

            I think it's probably possible to achieve the level of simplification we're talking about, but as I explained in https://news.ycombinator.com/item?id=43332195, the older systems we're talking about here are in fact significantly more code than they are being represented as here.

            l10n message catalogs (.po files, which I think is what your remark about i18n is intended to refer to) are not conventionally considered to contain lines of source code.

            You can write a lot of games in not much code, especially when efficiency isn't a major concern. I've made some efforts in this direction myself.

            Without a game engine, Tetris is maybe 200 lines of JS http://canonical.org/~kragen/sw/inexorable-misc/tetris.html, 200–300 lines of C (building on the fairly minimal graphics library Yeso) https://gitlab.com/kragen/bubbleos/-/blob/master/yeso/tetris... https://gitlab.com/kragen/bubbleos/-/blob/master/yeso/tetris..., or 300–400 lines of assembly when optimized for small executable size http://canonical.org/~kragen/sw/dev3/tetris.S https://asciinema.org/a/622461. The fairly elaborate curses Tetris in bsdgames (which I didn't write) is 1100 lines of C. Emacs comes with a tetris.el (which I also didn't write) which is almost 500 lines of Lisp.

            Space Invaders without a game engine, using only a color-rectangle-fill primitive for its graphics, is about 160 lines of JS http://canonical.org/~kragen/sw/dev3/invaders ; when I factored out a simple game engine to make the code more readable, and added some elaborate explosion effects, it's closer to 170 lines of JS http://canonical.org/~kragen/sw/dev3/qvaders. (These have a bug I need to fix where they're unplayably fast on displays running at 120Hz or more.)
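            (The fix is the standard fixed-timestep pattern: step the simulation in fixed increments and redraw once per display frame, rather than stepping once per frame. Schematically, in Python rather than JS:)

                import time

                STEP = 1 / 60                      # simulate at 60Hz regardless of refresh rate

                def run(update, render, now=time.monotonic):
                    acc, prev = 0.0, now()
                    while True:
                        t = now()
                        acc += t - prev
                        prev = t
                        while acc >= STEP:         # catch up in fixed-size steps
                            update(STEP)
                            acc -= STEP
                        render()                   # draw as often as the display asks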

            A minimal roguelike, hunting treasures in a colored ASCII-art maze, is 54 lines of Forth: http://canonical.org/~kragen/sw/dev3/wmaze.fs https://asciinema.org/a/672405.

            And, though it doesn't quite rise to the level of being a game, a Wolf3D-style raycasting engine with an editable maze world is about 130 lines of JS http://canonical.org/~kragen/sw/dev3/raycast.

            I didn't write this, but you can write Pong in JS in Notepad in five minutes: https://www.youtube.com/watch?v=KoWqdEACyLI

            Emacs comes with a peg solitaire (which I didn't write) which is under 400 lines of Lisp.

            The Hugi sizecoding competition delivered a lot of games that were under 200 bytes of code: https://www.hugi.scene.org/compo/compoold.htm

            My experience writing these tiny games is that a lot of the effort is tuning parameters. If Tetris speeds up too fast, it's too frustrating; if it speeds up too slowly, it's boring. If the space invaders drop too many bombs, or advance too quickly, it's too frustrating; if too few or too slowly, it's boring. I spent more time on the explosions in Qvaders than on writing the entire rest of the game, twice. This gameplay-tuning work doesn't show up as extra source code. All of these games significantly lack variety and so don't reward more than a few minutes of play, but that's potentially fixable without much extra code, for example with interesting phenomena that stochastically occur more rarely or with more level designs.

            MacPaint, for what it's worth, is about 2000 lines of assembly and 3500 lines of Pascal: https://computerhistory.org/blog/macpaint-and-quickdraw-sour... but you could clearly put together a usable paint program in much less code than that, especially if you weren't trying to run on an 8MHz 68000 with 128KiB of RAM.

sunrunner 4 days ago

I couldn't help but notice that the authors were credited "In random order" and am now wondering a) why not alphabetical? and b) did they just shuffle the order once, or was it "random until we found an order that happened to match some other criteria we had in mind"?

  • layer8 4 days ago

    With PDF, you can in principle have a different random order every time you open the PDF.

    • sunrunner 4 days ago

      Interesting point! I wonder how that fits in with PDF/A and the idea of an archival format for long-term preservation of the original document...

      • layer8 4 days ago

        It’s not possible with PDF/A, which prohibits both JavaScript and PostScript in PDF. (These are the only ways I can think of to obtain a random number in PDF.)

    • notfed 4 days ago

      Please do link an example of PDF randomness...

      • layer8 3 days ago

        PDF lets you run JavaScript (if the PDF viewer doesn’t block it), including on opening a document. Math.random() is available in that environment. JavaScript in PDF allows you to switch the visibility of layers on and off, or to fill in the values of text fields. PDF forms sometimes use this to generate likely-unique IDs. Alternatively, PDFs can have embedded PostScript to render contents. PostScript has a built-in random-number function, rand. (PostScript support is deprecated in PDF 2.0.)
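        For anyone who wants to experiment, a minimal sketch of the JavaScript route using pypdf: add_js attaches document-level JavaScript that runs on open, viewer permitting. (The input file "in.pdf" and its text field "authors" are assumptions of the example.)

            from pypdf import PdfReader, PdfWriter

            writer = PdfWriter()
            writer.append(PdfReader("in.pdf"))  # assumed to contain a text field "authors"

            # Document-level JavaScript; runs on open in viewers that permit it.
            writer.add_js("""
            var names = ["A", "B", "C"];
            names.sort(function () { return Math.random() - 0.5; });  // crude shuffle
            this.getField("authors").value = names.join(", ");
            """)

            with open("out.pdf", "wb") as f:
                writer.write(f)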

  • mpreda 4 days ago

    It's clear that alphabetical order is open to manipulation. Go down that path and everybody pursuing a scientific career will be named A.A.

    • lolinder 4 days ago

      This is obviously an absurd overextrapolation, and it's unlikely that a significant number of people would actually change their name to exploit it, but the principle is accurate: If alphabetical is used consistently then someone with the last name Zelenskyy will consistently end up last in every list of coauthors, while Adams will consistently come near the top. Even if people intuitively understand that alphabetical ordering is used because all coauthors are equal, the citations will still be for Adams et al., and it's not hard to see how that would give an unfair non-merit-based leg up to Adams over Zelenskyy.

      If applied consistently, random order would be a fairly sound way to ensure that over a whole career no one gets too large a leg up from their surname alone.

      • CyberDildonics 4 days ago

        > This is obviously an absurd overextrapolation, and it's unlikely that a significant number of people would actually change their name to exploit it, but the principle is accurate:

        That's called a joke.

        • lolinder 4 days ago

          I wasn't criticizing OP's statement, just elaborating on it. And it wasn't a joke so much as a rhetorical device.

m3kw9 3 days ago

We have arrived at vibe coding in 2025: coding with English sentences.

ltbarcly3 4 days ago

This sort of program is always such a huge waste of time. The way progress is made is not by carefully studying all the ways something is currently done and then thinking "maybe we make it graphical". This is the sort of thing that happens when there is too much grant money available and no ideas, so they just put 20 mediocre grad students to work. That is just never going to produce anything that is going to resemble in any way what programming will actually look like when it is 'reinvented'.

Let me guess: they published a bunch of papers, did a bunch of experiments like "let's do X but GUI" and "what if you didn't have to learn syntax", and then nobody ever did anything with any of the work because it was a total dead end.

Look at how progress has moved in the past. It wasn't from some deliberate process. Generally technology improves in some way, and a person with early access to that advance just immediately applies it to some problem.

The instant computers got powerful enough, someone invented assembly.

The instant computers got powerful enough, they invented Lisp and C. There wasn't even a gap of a year in most cases between some new capability becoming available and someone applying it. There wasn't some grand plan; it was just someone playing with a powerful new capability and using it for something.

This happens across all of human activity. The Wright brothers weren't special geniuses; they happened to be working on kites when engines with just enough power-to-weight ratio became available to keep a kite in the air under its own power, and they slapped one of those engines on a kite. If they hadn't done it, someone else would have done it a month later, because once the technology was available, the innovation was obvious.

You don't make leaps from paying grad students to play around with "how can we make programming better", you get it from all of a sudden an AI can just generate code.

  • gokr 4 days ago

    Perhaps you are aware of who Alan Kay, Bret Victor, Ian Piumarta, etc. are, but... anyway, I just wanted to chip in with some context for those who are not. Alan Kay led the creation and development of Smalltalk at Xerox PARC, thus creating the first real GUI and introducing all the UX elements we today take for granted (windows, buttons, etc. - yes, the same ones Steve Jobs got a demo of), establishing OOP (some would argue Simula was first, but Smalltalk took it 10x further) and much, much more. So his legacy in the world of computing is... legendary, I would argue; just google him.

    Now, Ian has done truly brilliant VM/JIT work in the Smalltalk community, and I suspect that Ian did a lot of the DSL-related stuff in STEPS, like the TCP/IP stack etc.

    So this is not just some "random folks" :) and... some of the most brilliant work comes when you are really not sure where you are going. The original Smalltalk work that gave us OOP and modern GUIs came from Alan's lifelong task of making computers usable by children in order to amplify education. Most would argue we are still struggling with that task ;) but at least we got a lot of other things out of it!

  • pcfwik 4 days ago

    > Let me guess: they published a bunch of papers, did a bunch of experiments like "let's do X but GUI" and "what if you didn't have to learn syntax", and then nobody ever did anything with any of the work because it was a total dead end.

    This response is very confusing to me, and it seems you have a very different understanding of what STEPS did than I do.

    In my understanding, the key idea of STEPS was that you can make systems software orders of magnitude simpler by thinking in terms of domain-specific languages, i.e., rather than write a layout engine in C, first write a DSL for writing layout engines and then write the engine in that DSL. See also, the "200LoC TCP/IP stack" https://news.ycombinator.com/item?id=846028
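    As a toy illustration of that approach (my sketch, not VPRI code): a few parser combinators form a micro-DSL in Python, and the "engine" written in that DSL is then just a couple of declarative lines.

        # Each parser maps (text, pos) -> (value, new_pos), or None on failure.
        def lit(s):
            return lambda t, i: (s, i + len(s)) if t.startswith(s, i) else None

        def seq(*ps):
            def parse(t, i):
                vals = []
                for p in ps:
                    r = p(t, i)
                    if r is None:
                        return None
                    v, i = r
                    vals.append(v)
                return vals, i
            return parse

        def alt(*ps):
            def parse(t, i):
                for p in ps:
                    r = p(t, i)
                    if r is not None:
                        return r
                return None
            return parse

        def many(p):
            def parse(t, i):
                vals = []
                while (r := p(t, i)) is not None:
                    v, i = r
                    vals.append(v)
                return vals, i
            return parse

        # The "engine" written in the DSL: binary numerals in two lines.
        bit = alt(lit("0"), lit("1"))
        number = seq(bit, many(bit))
        print(number("10110", 0))  # (['1', ['0', '1', '1', '0']], 5)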

    You seem to think they're advocating a Scratch-like block programming environment, but I'm not sure that's accurate. Can you point to where in their work you're finding this focus?

    I too believe STEPS was basically a doomed project, but I don't think it's for the reason you've said (more so the extreme amount of backwards compatibility users expect from modern systems).

    (--- edit: ---)

    > You don't make leaps from paying grad students to play around with "how can we make programming better", you get it from all of a sudden an AI can just generate code.

    I think this is a more compelling point, but it doesn't seem to explain things like the rise of Git as "a way to make programming (source control) better," and it's not clear how to determine when something counts as an "all of a sudden" sort of technology. They would probably say their OMeta DSL-creation language was this sort of "all of a sudden" technological advance that lets you do things in orders of magnitude less code than before.

    • Veserv 4 days ago

      That is not a TCP/IP stack in 200 LoC. The thing described is a TCP/IP segment parser with response packet encoder.

      The “stack” described [1] cannot transmit a payload stream. That also means it avoids the “costly (in terms of lines)” problems of transmit flow control and “reliable send” guarantees in the presence of transmit pipelining, which are needed for even modest throughput. For that matter, it does not even reconstruct the receive stream, which is, again, one of the more “costly” elements due to data allocation and gap detection. It also does not appear to handle receive-side flow control, but that could be hidden in the TCP receive engine state, which is not described.

      These are not minor missing features. The thing described does not bear any resemblance to an actual TCP implementation, instead being more similar to just the receive half of a bespoke stateful UDP protocol.

      Now it is possible that the rest of the TCP/IP stack exists in the other lines, as only about 25-ish lines are written down, but you can conclude almost nothing from the described example. The equivalent C code supporting the same features would be similar-ish (under 100 lines) in length, not 10,000 lines. That is not to say it is not a tight implementation of the feature set, but it is not reasonable to use it as evidence of multiple order of magnitude improvements due to representation superiority.
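      For a sense of proportion, the part that is shown, pulling fields out of a segment header, is a few lines in any language. A rough Python equivalent of just that slice (RFC 793 fixed header; options, checksum verification and payload handling all ignored):

          import struct

          def parse_tcp_header(seg: bytes) -> dict:
              # Fixed 20-byte TCP header per RFC 793; no options, no payload.
              sport, dport, seq, ack, off_flags, win, cksum, urg = \
                  struct.unpack("!HHIIHHHH", seg[:20])
              return {
                  "src_port": sport, "dst_port": dport, "seq": seq, "ack": ack,
                  "data_offset": (off_flags >> 12) * 4,  # header length in bytes
                  "flags": off_flags & 0x3F,             # URG/ACK/PSH/RST/SYN/FIN bits
                  "window": win, "checksum": cksum, "urgent": urg,
              }

          # Retransmission, flow control, stream reassembly and congestion
          # control, the parts that dominate a real stack, start after this.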

      [1] https://worrydream.com/refs/Kay_2007_-_STEPS_2007_Progress_R... Page 44

      • pcfwik 4 days ago

        > That is not a TCP/IP stack in 200 LoC.

        I agree --- I mostly think it's interesting as one of the most concrete examples of what they claim to have actually done that I've been able to find.

        In general, it's frustrating that, as far as I can tell, they don't seem to have made any of the code from the project open source. Widespread skepticism about their claims due to this is (IMO) justified.

        (Edit: for folks interested, it seems like some of the code has found itself scattered around the Web ... https://news.ycombinator.com/item?id=19844088 )

    • ltbarcly3 4 days ago

      I guess I'm making a wider claim about the effectiveness of funding 'directed innovation'.

      Very often innovations can happen if you fund accomplishing some previously unaccomplished task. Building a nuclear bomb, sending a man to the moon, decoding the genome. The innovations come about because you have smart people trying to solve something nobody has ever tried to solve before, and to do that they slightly revolutionize things.

      I'm not aware of a single case where the goal was defined in terms of innovation itself, as in "find a better way to program" and anything useful or even slightly innovative resulted. They are by definition doing something that lots of people are trying to do all the time. It's just very unlikely that you are creating conditions which are novel enough to produce even a slightly new idea or approach.

      Generally what you get is a survey of how things are currently done, and some bad ideas about how you could tweak the current state of affairs a little. But if there was a way to just patch up what we already know how to do then it's very likely someone already tried it, really it's likely 1000 people already tried it.

      • pcfwik 4 days ago

        Sorry, added an edit to my above post before I saw this, just to summarize:

        I think that's a more reasonable complaint, but I fear it's too vague to be applicable.

        The STEPS folks would probably say that a modern computing environment in ~20kloc is something that was previously unaccomplished and thought to be unaccomplishable, but you're writing that off/not counting it as such, presumably because it failed.

        On the other end of the spectrum, things like Git (to my knowledge) did come out of the "find a better way to source control" incremental improvement mindset. (Of course, you can say the distributed model was "previously unaccomplished," but the line here is blurry.)

        • ltbarcly3 4 days ago

          I don't think Git itself is a revolution or new technology. It took what people were trying to do (but with an extremely frustrating, bad user experience, like taking an hour to change branches locally) and just did it very well, by focusing on what is important and throwing away what isn't. It's an achievement of fitting a solution to a problem.

          I don't think they DID build a modern computing environment at all. They built something that kind of aped one, but unlike Git they missed the parts that made a computing environment useful for users. It's more like one of those demoscene demos where you say "Wow, how did they get a Commodore 64 to do that!?"

          If they did build a modern computing environment in 20k LOC, that is a trillion-dollar technology. Imagine how much faster Microsoft or Apple would be able to introduce features and build new products if they could do the same work with 2 orders of magnitude less code! That this hasn't happened is strong evidence that it wasn't actually the result.

          • pcfwik 4 days ago

            > I don't think they DID build a modern computing environment at all.

            I agree that, now, after they've tried and failed, we can say they didn't build a modern computing environment in 20kloc.

            My point is just that, when they were pitching the project for funding, there was no real way to know that this "trillion dollar technology" goal would fail whereas nukes/moon mission/etc. would succeed. Hindsight is 20/20, but at the beginning, I don't think any of these projects defined themselves as "doing something that lots of people are trying to do all the time;" instead they would probably say "nobody else has tried for a 20kloc modern computing system, we're going to be the first."

            Given they all promised to try something revolutionary, I'm not sure it's fair to claim after-the-fact how obvious it is that one would fail to achieve that vs. another.

            But I do take your point that it's important in general not to fall into the trap of "do X but tweak it to be a bit better" and expect grand results from that recipe.

  • Exoristos 4 days ago

    "The instant computers got powerful enough, someone invented assembly.

    "The instant computers got powerful enough, they invented lisp and C. …

    "… The Wright brothers weren't special geniuses, they happened to be working on …"

    Even by your own account, it sounds like industrious, visionary researchers have been continuously making ready to exploit new hardware.

    • ltbarcly3 3 days ago

      No, it's just that these things are obvious. There were many, many similar innovations at the same time; we just remember the one that won.