asrp · on May 2, 2024

Very nice work and cool project.

However, the example in "3. Baremetal Cognition" is explained in an overly convoluted way, with many choices that IMO detract from the point that (I think) you're trying to make. There are typos that make it even harder to understand.

1. Use something like underscore instead of spaces, and maybe even another character like period instead of newline. You can explain after the section that you could have used space and newline instead of _ and .

2. Immediately after showing

    ldfgldftgldfdtgl
    df_
    _
    dfiff1_crank_f

you can parse it out for the reader, as something like

    "l" 'set-non-delim eval
    "gl" 'set-non-delim eval
    "tgl" 'set-non-delim eval
    "dtgl"
      "\n" 'set-non-delim eval
      "_\n"
        "_\n" 'set-non-delim eval
      'set-ignore eval
    eval

and so on. Or maybe even

    (set-non-delim "l")
    (set-non-delim "gl")
    (set-non-delim "tgl")
    (set-non-delim "dtgl")
    (set-non-delim "_\n")
    (set-ignore "_\n")
    (dtgl)

and only then you'd go through the source, character by character. Just because the source is hard to read by humans, doesn't mean we need to stick to it in an explanatory example.

3. > Delimiters have an interesting rule, and that is that the delimiter character is excluded from the tokenized word unless we have not ignored a character in the tokenization loop, in which case we collect the character as a part of the current token and keep going.

There are four "negations" in this sentence ("excluded", "unless", "not", "ignored") and two turns, all to explain something ostensibly simple: when to end tokens added to the stack (or container). This, together with whitelist, blacklist, delim and singlet, needs much cleaner naming and description.

Also, "set non-delimiter" is an extra negation.

4. There's an error right after "Now, for the rest of the code: ". The third line contains two spaces instead of a single one. (Using suggestion 1 would have also avoided this for yourself.)

# Comment about the actual content

5. I can kind of see the rationale for this (which is also explained in the beginning). However, I don't see exactly where we'd set clear boundaries since we can always stuff semantics into the initial parser. For example, instead of having `f` bound to eval, we could have set `f` to execute the entire bootstrapping sequence and then rebind `f` to eval. So the entire example would be reduced to just `f`.

I guess we'd have to argue that the initial set of functions we are allowed is somehow primitive enough. But even `d` (set-non-delim), while it only toggles some values in an array (or list), piggybacks on the parser's inherent ability to skip characters in its semantics, and `i` (set-ignore) needs inversion implemented in the parser.

6. Here we assume that one byte per character is the default starting state of the world, but Unicode and other encodings don't have this, so you'd need some parser to get started anyway. And in that case, is an initial parser using space and end of line as separators really unusual?

7. I don't see why (not) reading ahead would be such an important property for modifiable syntax. You just need to not read ahead too much, like the entire rest of the file or stream.

8. Regardless, I think this is worth exploring, but also keep some of these questions in mind while doing that.

asrp · on July 4, 2022

After each outer loop iteration, A[:i] (the array up to i) is in ascending order with the max of A at A[i].

This is true after the first iteration since max(A) is eventually swapped to A[1]. This is true in subsequent iterations since during the ith iteration, it inserts the next element, initially at A[i], into A[:i-1] and shifts everything up with swaps with A[i], so A[:i] is sorted, with the max of A moved to A[i]. After that no more swaps happen in the ith iteration since A[i] contains the max of A.
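A quick way to check this invariant is to assert it after every outer iteration. The sketch below assumes the algorithm under discussion is the plain double loop that swaps A[i] and A[j] whenever A[i] < A[j] (the comment above doesn't restate it, so that is my reading). `swap_sort` is a made-up name, and indices are 0-based here, so "A[:i] with the max at A[i]" becomes `a[:i + 1]` and `a[i]`.

```python
def swap_sort(values):
    """Double-loop sort; asserts the invariant after each outer iteration.

    Assumption: this is the algorithm being discussed (swap whenever
    a[i] < a[j], with both loops running over the whole array).
    """
    a = list(values)
    n = len(a)
    for i in range(n):
        for j in range(n):
            if a[i] < a[j]:
                a[i], a[j] = a[j], a[i]
        # Invariant: the prefix up to and including i is ascending...
        prefix = a[:i + 1]
        assert prefix == sorted(prefix)
        # ...and the maximum of the whole array sits at position i.
        assert a[i] == max(a)
    return a


print(swap_sort([3, 1, 4, 1, 5, 9, 2, 6]))
```

If either assertion failed for some input, the argument above would be wrong for that input; it holds for the cases I tried.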

asrp · on March 28, 2021

Is there some place STEPS fans can gather and gather our notes? There are archives of the FONC mailing list here [1].

I'm an outsider and also never got Frank to work. I was waiting for the Nile/Gezira thesis to get high level (but hopefully also some detailed) descriptions of how they handled graphics. I vaguely remember getting parts of idst working but for each of these projects, there were always multiple versions lying around. Sometimes in odd places.

I read Alex Warth's thesis and it's well written, in a way that makes it very easy to understand. So, of course, I had to implement my own OMeta variant [2].

Also, the VPRI website itself says it's shutting down (presumably folks moved to HARC at that time?).

Edit to add that OMeta is the language agnostic parser and compiler!

[1] https://www.mail-archive.com/fonc@vpri.org/

[2] https://github.com/asrp/pymetaterp

elgertam · on March 28, 2021

> Is there some place STEPS fans can gather and gather our notes? There are archives of the FONC mailing list.

Maru development is documented on an active mailing list.[1] Ohm development is being coordinated through GitHub. I'd personally like to take the extant code from OMeta/JS and the JS implementation of Nile & Gezira, and modernize them.

Recently I've been wondering if there's enough interest for a Discord server or something. (In the spirit of STEPS, it'd be ideal to make a new collaborative thing that's really different than static text/audio/video on the web, but gotta start somewhere. :) ) Unfortunately, I have had other, higher-priority projects at the moment, so I have taken no initiative to try to build a community.

I will also say that in my opinion, it's not clear to many of the people who made this stuff how special it is. The only exception to that is Bret Victor, who actually is not well-understood, but even the banana pudding versions of his ideas are typically much better than the industry's.

> I'm an outsider and also never got Frank to work. I was waiting for the Nile/Gezira thesis to get high level (but hopefully also some detailed) descriptions of how they handled graphics. I vaguely remember getting parts of idst working but for each of these projects, there were always multiple versions lying around. Sometimes in odd places.

I've never gotten Frank to work, and I abandoned my attempts. I've seen it run, though. The name was fully truthful: it really is Frankenstein's monster.

I did get Nile + Gezira to work (albeit in a very crude way by printing numbers to the console rather than hooking it up to a frame buffer). That's how I met Dan. I don't want to betray any confidences with him, but there is ongoing work with Nile.

Here's Dan himself presenting a related language at Oracle Open World in a demo (around 25 mins in).[2] (Full disclosure: I worked on the demo.)

If it were me getting started, I would take a look at the JavaScript implementation of Nile in Dan's Nile repo on GitHub. It should more or less work out of the box, and there's an HTML file containing a fairly full subset of Gezira. The only problem is that the JS style is way out of date, and so it does some things that are heavily frowned upon today. It may not work with tools like Webpack.

The Maru-based Nile is trickier to get working, but it does work. The issue with Ian's Maru is that it's quite hard to reason about and lacks clear debugging tools. I've gotten both up and running. I seem to remember the Boehm GC was pivotal in getting Maru to bootstrap and then run Nile.

> I read Alex Warth's thesis and it's well written, in a way that makes it very easy to understand. So, of course, I had to implement my own OMeta variant [2].

Pymetaterp is cool! I agree: Warth's work on OMeta was impressive. In some ways, Ohm feels inferior to me, though they're both good tools with lots of potential.

OMeta is the one tool from STEPS that is basically simple to understand and use without having to do a bunch of code archaeology.

> Also, the VPRI website itself says it's shutting down (presumably folks moved to HARC at that time?).

VPRI closed because STEPS ended and because Alan had to retire at some point. HARC and/or CDG Labs continued the work, but then closed as well. (I don't know all of the details, but someone here suggested SAP withdrew funding. That would track with what I do know.)

Today, Ian is teaching in Japan, Dan is at Vianai, Alex is at (IIRC) Google, Yoshiki is at Croquet, Bret Victor is doing Dynamicland, Vi Hart is at Microsoft Research and then Alan is retired. There were quite a few others I'm missing, and they are all doing interesting things as well.

[1] https://groups.google.com/g/maru-dev

[2] https://www.oracle.com/openworld/on-demand.html?bcid=6092429...

asrp · on April 1, 2021

I've put up a barebones Slack [1] and editable Wiki [2]. I might fill that with info I have in the coming weeks since I realized all I had were scattered files.

Diving into this a bit, I remembered that fonc had its own (now defunct) wiki. [3] It seems like a lot of the important pages were unfortunately not updated though.

[1] https://join.slack.com/t/footprintsorg/shared_invite/zt-o7ch...

[2] https://hackmd.io/SB4QqG7bSxmgoUvPPoSzUA

[3] http://vpri.org/fonc_wiki/

[3, archive.org] https://web.archive.org/web/20110901193854/http://vpri.org/f...

azeirah · on March 28, 2021

> I will also say that in my opinion, it's not clear to many of the people who made this stuff how special it is. The only exception to that is Bret Victor, who actually is not well-understood, but even the banana pudding versions of his ideas are typically much better than the industry's.

I would love to hear more about how you believe not only outsiders, but also the people who made this misunderstand this work?

How do you see the importance of STEPS and Bret Victor's work? I'm a big fan, and you clearly have a lot of knowledge. I'd love to read more!

asrp · on March 28, 2021

Thanks, a lot of this is new and useful to me.

> Recently I've been wondering if there's enough interest for a Discord server or something. (In the spirit of STEPS, it'd be ideal to make a new collaborative thing that's really different than static text/audio/video on the web, but gotta start somewhere. :) ) Unfortunately, I have had other, higher-priority projects at the moment, so I have taken no initiative to try to build a community.

I don't really like Discord because they keep asking for phone verification and, early on, they pretty aggressively shut down alternate client attempts.

What about Mattermost? I could try to set one up though initially, we wouldn't have email notifications or a CDN. Might not be so good if the initial group is small.

Slack? Don't know how they compare to Discord but at least they don't ask for phone verification.

A subreddit? A mailing list? Some kind of fediverse thing?

If there's some possibility of migrating to our own platform, I guess it doesn't matter as much where we start.

I could try to set something up in the coming week. But interest in this HN thread will probably have died by that time.

> I did get Nile + Gezira to work (albeit in a very crude way by printing numbers to the console rather than hooking it up to a frame buffer). That's how I met Dan. I don't want to betray any confidences with him, but there is ongoing work with Nile.

Nice! I'm not anywhere near that. I'm still looking for a description of what it _is_ and, at a very high level, how it works internally. Something like "it's mathematical notation to describe the pixel positions/intensities implicitly via constraint equations; it uses a <something> solver for ...". What's in quotes could be way off and is from memory of what I remember seeing.

> I've gotten both up and running. I seem to remember the Boehm GC was pivotal in getting Maru to bootstrap and then run Nile.

I also vaguely remember something about getting the right Boehm GC version so that some of

> Pymetaterp is cool! I agree: Warth's work on OMeta was impressive. In some ways, Ohm feels inferior to me, though they're both good tools with lots of potential.

Thanks! I share similar thoughts about Ohm. Having a visual editor is very nice, though I tend to use breakpoints for parser debugging [1].

Edit to add that id-objmodel [2] is another STEPS project I found to be simple and useful as an idea.

[1] See, for example, "Debugging" in https://blog.asrpo.com/adding_new_statement

[2] https://www.piumarta.com/software/id-objmodel/

asrp · on March 21, 2021

Sorry if I've asked these years ago and just don't remember the answer.

> The '/copy-to-ebx' after the slashes is just my way of helping the reader understand what the instruction does. I don't want the reader to have to consult the Intel manual for every instruction, even if I'm forcing the writer to do so.

Why not make the comment the instruction and the bytes the (maybe even optional?) comment in that case then?

From your first post.

> The fact that C compilers are written in C contributes a lot of the complexity that makes compilers black magic to most people.

Isn't this more a symptom of C though? I'm hoping this is generally not true if you replace C with other languages (but could be very wrong). But more generally, I'm thinking you could make "the compiler's inner workings are not black magic" a constraint, rather than making "don't write the higher level language in the higher level language" the constraint.

In my case, I tried that first route and then moved to instead having the compiler written in the higher level language but emitting output that's close enough to (my) handwritten lower level language.

I'll have to read your two part post more carefully though. Glad to see this project getting some attention, even though in an unusual fashion.

akkartik · on March 21, 2021

Great questions! I've actually never considered putting the comment first! I'll have to think about that one.

You're right to point out that there are two components to "C compilers written in C make compilers seem complex": the metacircularity, and C-specific difficulties. I think I was focusing on the first when I wrote that, but I can't exclude the possibility you raise. A better language might reduce the need to understand it operationally, by looking under the hood to understand what a line of code is translated to. The Mu way may well be a dead end, since the requirement of understanding translated code restricts how complex compiler optimizations can get. You probably don't want to understand Haskell's loop fusion by comparing source and generated code.

In my mind there's an idea maze where there are 3 major possibilities for improving the future of software:

a) Simple languages and translators that are easy to understand by running them. This is the Mu way.

b) Type-driven languages that are easy to understand by reading them. Haskell and OCaml seem to fit here, and they may well be the right answers.

c) Complex languages that discourage abstractions atop them. This is the APL way, and it too might end up being the right way.

I'm doing a) mostly because it seems to fit my brain better. I just can't seem to get into Haskell or OCaml or APL.

asrp · on March 21, 2021

> I've actually never considered putting the comment first! I'll have to think about that one.

I'm sure there are many competing constraints so definitely don't do it because I'm suggesting this on a whim. :) My reasoning is that as a human reader, the comment is the more readable part, so I'd want to see it first. And for a computer, it probably doesn't care if the op code appears first or not.

> You probably don't want to understand Haskell's loop fusion by comparing source and generated code.

Indeed. But even though C and Haskell are very different, I think they share a common philosophy about compilation where you can basically do whatever you want as long as it still produces the same result.

I vaguely remember looking at Python's generated bytecode (with `dis.dis`) and seeing it wasn't too bad. I haven't tried it on a larger program though.

There's tcc (and more recently chibicc that I haven't had a chance to check out yet) that you're probably already aware of. Is the generated output still pretty bad?
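For anyone who wants to try this, here is a minimal sketch of that kind of inspection. `add` is just a stand-in function; `dis.dis` and `dis.get_instructions` are from the standard library. The exact opcodes differ between CPython versions, so no expected listing is shown.

```python
import dis


def add(a, b):
    """Stand-in function whose bytecode we want to look at."""
    return a + b


# Human-readable listing, one bytecode instruction per line.
dis.dis(add)

# The same information as data, e.g. to compare against the source.
opnames = [instruction.opname for instruction in dis.get_instructions(add)]
print(opnames)
```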

I'll also throw my own attempt in the ring

- High level https://github.com/asrp/flpc/blob/master/lib/stage0.flpc

- Low level (up to line 45) https://github.com/asrp/flpc/blob/master/precompiled/self.f

even though it's not quite optimized for this purpose and the code itself is still a bit unclean. If there was a syntax highlighter for the low level language, I'd probably highlight "[", "]" and "bind:" as a start. I can try to clarify any obscure syntax or primitive.

Some more general ideas to get around the issue.

- Invoke optimization only when asked specifically (and apply the optimization locally). That is, optimization would need at least additional syntax in the language.

- Explicitly track correspondence between source and target (at the character or token level) and also do this in each optimization pass. Maybe even keep the intermediate values of each pass so you can browse through it like a stack trace.
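To make the second idea a bit more concrete, here is a rough sketch under my own assumptions: a toy postfix language, a single constant-folding pass, and hypothetical names (`Tok`, `tokenize`, `fold_constants`, `run_passes`). Every token carries the source span it came from, each pass preserves or merges spans, and the output of every pass is kept in a history list that can be browsed afterwards.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Tok:
    text: str
    span: tuple  # (start, end) character offsets into the original source


def tokenize(source):
    """Split on whitespace, remembering where each token came from."""
    toks, pos = [], 0
    for word in source.split():
        start = source.index(word, pos)
        pos = start + len(word)
        toks.append(Tok(word, (start, pos)))
    return toks


def fold_constants(toks):
    """Replace `<int> <int> +` by the sum; the new token covers all three."""
    out = []
    for tok in toks:
        if (tok.text == "+" and len(out) >= 2
                and out[-1].text.isdigit() and out[-2].text.isdigit()):
            b, a = out.pop(), out.pop()
            total = str(int(a.text) + int(b.text))
            out.append(Tok(total, (a.span[0], tok.span[1])))
        else:
            out.append(tok)
    return out


def run_passes(source, passes):
    """Keep the result of every pass, like frames in a stack trace."""
    history = [("tokenize", tokenize(source))]
    for optimization in passes:
        history.append((optimization.__name__, optimization(history[-1][1])))
    return history


source = "1 2 + print"
history = run_passes(source, [fold_constants])
for name, toks in history:
    print(name, [(t.text, source[t.span[0]:t.span[1]]) for t in toks])
```

The folded token `3` still points back at the characters `1 2 +` in the source, which is the kind of relation that is usually lost when values get optimized out.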

> In my mind there's an idea maze where there are 3 major possibilities for improving the future of software:

I guess I'm trying another route even though I don't know if it fits the definition of improving the future of software.

d) Have programmers make their own compiler/interpreter and language by giving them the tools and knowledge to do that (more) easily.

This would (hopefully) avoid the black box/magic issue since the programmer would know the details of the inner workings by virtue of having written it. Though I'm most definitely very far from the goal and the questions can be asked about how to improve their target language.

akkartik · on March 21, 2021

Oh your project looks familiar. Though I might have seen it a long time ago. I'll take a closer look.

> My reasoning is that as a human reader, the comment is the more readable part, so I'd want to see it first. And for a computer, it probably doesn't care if the op code appears first or not.

Yeah, for sure. One rebuttal that comes to mind is the dictum, "don't get suckered by comments, debug code." Comments are useful, but too much emphasis on them has led to dark times in my past :)

Still very worth considering.

asrp · on March 21, 2021

I've read through more of your post and came across the bottom comment (don't know how to permalink to it), which better expresses my comment above.

> An optimizing linter has the problem of being destructive. It goes like this:

> The programmer will write his or her program in a readable way. They'll run it through the compiler, which points out that something can be optimized, the programmer—having already gone through the process of writing the first implementation with all its constraints and other ins and outs fresh in their mind—will slap their head and mutter "of course!", and then replace the original naive implementation with one based on the notes the compiler has given. Chances are high that the result will be less comprehensible to other programmers who come along—or even to the same programmer revisiting their own code 6 months later.

Also a data point and word of warning about (lack of) optimization. My own projects (one of which was mostly hand-written in x86 assembly) have been pretty heavily stalled by speed issues that sent me on significant detours. Since you are working with your own compiler/interpreter to implement your levels, you are directly affected by their compilation speeds as you iterate. Even with modern hardware, they can quickly become too slow to be even usable.

This is unfortunately another consequence of having too much black magic in (C) compilers. So we get the wrong intuition about how fast computers are.

akkartik · on March 21, 2021

Were your languages very high-level? If so, that kinda rhymes with my experience on past projects. The more expressive the language, the easier it is for programs to create combinatorial explosions that slow everything down if compiled naively.

asrp · on March 29, 2021

The language is high-level but I wouldn't necessarily say very high level. But because I'm trying to spin up language features at runtime (like a lot of Forth does), there are a few layers on top of the language primitives.

I wish there was some framework for me to add optimizations as I go along, especially if there could be some speed guarantee. Though in my case, I'd like to also not lose the relation to the original source (like C does when values are optimized out).

asrp · on April 22, 2020

If you want to read this, I'd suggest looking at the sources in bootstrap sequence from the readme (boot.flpc, stage0.flpc, ...). Alongside, run some of the precompiled entries by hand by pasting them into the interpreter. Call `ps` once in a while to see the current state. Then for larger chunks of code invoke breakpoints by calling `debugger()` (in a `.flpc` file) or `debugger` (in a `.f` file though this will mess up source position printing beyond this point).

And of course, feel free to just ask!

asrp · on April 22, 2020

It is actually not hard to read (at least in the sense of knowing what something will do when executed; getting the bigger picture takes more practice). The (base) syntax is just whitespace delimited tokens, each representing a function call (or string but we'll come to that later). So

    foo bar baz

will call the 3 functions in order

    foo()
    bar()
    baz()

All functions are nullary (with side effects; these side effects determine their "true" arity). There aren't really any special characters other than whitespace so

    1 1 + print

just translates to

    1()
    1()
    +()
    print()

Function names do not have to start with a letter or be alphanumeric. I happen to name my functions so that those ending in a colon treat the next token to the right as a string instead of a function call. The [ function treats everything as strings until ] (that is, a single close square bracket as a token) and puts the functions in that body in a quote, effectively creating an anonymous function. So

    [ foo bar baz ] bind: somename

defines a function. The equivalent in Python would be

    def somename():
        foo()
        bar()
        baz()

You can then call somename in later functions.
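To make the reading rules above concrete, here is a toy Python sketch of them. It is not how FlPC is actually implemented; `make_interpreter`, the handling of numbers, the lack of nested brackets and the `out` list standing in for printing are all simplifying assumptions of mine.

```python
def make_interpreter():
    stack, out = [], []
    env = {
        # Each word is a nullary function; its "true" arity is a side effect
        # on the shared stack.
        "+": lambda: stack.append(stack.pop() + stack.pop()),
        "print": lambda: out.append(stack.pop()),
    }

    def call(name):
        if name in env:
            env[name]()
        else:
            # Assumption: a numeric token is a function that pushes itself.
            stack.append(int(name))

    def run(tokens):
        pos = 0
        while pos < len(tokens):
            tok = tokens[pos]
            pos += 1
            if tok == "[":
                # Everything up to "]" is kept as strings (a quote).
                end = tokens.index("]", pos)  # no nesting in this sketch
                stack.append(tokens[pos:end])
                pos = end + 1
            elif tok == "bind:":
                # Words ending in a colon read the next token as a string.
                name = tokens[pos]
                pos += 1
                body = stack.pop()
                env[name] = lambda body=body: run(body)
            else:
                call(tok)

    return run, stack, out


run, stack, out = make_interpreter()
run("1 1 + print".split())
run("[ 1 1 + print ] bind: two".split())
run("two two".split())
print(out)
```

The first line runs `1() 1() +() print()` directly; the second defines `two` the way `bind:` does above, and the third calls it twice.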

asrp · on April 22, 2020

Here's what I was thinking when I wrote the "incorrect assumption" line. But I think many other interpretations are also valid.

You can think of

    sum([x*x for x in range(10)])

as "desugaring" to

    sum(list_comp(lambda(x, quote(multiply(x, x))), range(10)))

which looks like Lisp if you move every open parens one token to the left and remove the commas

    (sum (list_comp (lambda x (quote (multiply x x))) (range 10)))

To evaluate this, parameters are first recursively evaluated (in order) and then the function is called on the outer value. Let's ignore the lambda for the moment.

    (sum (list_comp quoted_inner_func (range 10)))

results in the following function calls being made at execution time _in this order_

    quoted_inner_func 10 range list_comp sum

Normally, you'd have to pass the correct parameters to each function. However, in Forth, we use a global parameter stack, so provided all the functions respect their inputs and outputs, running the above body would provide the desired result on the parameter stack!
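Here is a small Python sketch of that last step, with a plain list playing the role of the global parameter stack. The helper names (`push`, `range_`, `sum_`) are made up for the illustration, and the quoted inner function is simply a Python lambda.

```python
stack = []  # the global parameter stack


def push(value):
    """Build a word that pushes a constant."""
    return lambda: stack.append(value)


def quoted_inner_func():
    # Pushes the quoted body (x -> x*x) without calling it.
    stack.append(lambda x: x * x)


def range_():
    stack.append(range(stack.pop()))


def list_comp():
    sequence = stack.pop()   # pushed last, so popped first
    function = stack.pop()
    stack.append([function(x) for x in sequence])


def sum_():
    stack.append(sum(stack.pop()))


# quoted_inner_func 10 range list_comp sum
program = [quoted_inner_func, push(10), range_, list_comp, sum_]
for word in program:
    word()

result = stack.pop()
print(result)  # 285, same as sum([x*x for x in range(10)])
```

No word takes explicit arguments; each one only pops its inputs and pushes its output, which is why running them in evaluation order is enough.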

jjcc · on April 22, 2020

Thanks. That's a good explanation. I like "desugaring"

asrp · on April 21, 2020

Author here! Go ahead and ask if you have any questions about internals or otherwise. Or e-mail me if you think of your question much later (e-mail in profile).

I'm surprised it made it here even during a Github outage.

asrp · on Nov 24, 2018

I found your game to be a great improvement over the course's UI. Well done!

asrp · on Sept 9, 2018

I'd be very interested in such a community if it existed. I'd go as far as creating/hosting/maintaining such a place if needed.

lisper · on Sept 9, 2018

I suggest starting with a Google Group. That's trivial to set up. I'd be interested in participating too.

mncharity · on Sept 12, 2018

I wonder what success looks like?

Here's another use case to add to "squishification". Some weeks back, I was exploring how temperature is taught, preK-12. I noticed that mention of 'Sun heats Earth' was wide-spread, but that 'Earth is cooled by the deep-space sky' was almost never mentioned. Half of the energy balance was ignored. So explanatory leverage is left on the table - "Why are nights cold? Especially with clear sky? Especially in the desert? Why are mountains snow-capped? Why is winter colder?" etc. It doesn't seem inaccessible - "Between bright hot Sun, too hot, and dark cold deep-space sky, too cold, Earth spins, mixing too hot, and too cold, into not too bad." Like a person huddled next to a campfire or heater, turning around to warm their back. Spacecraft "barbeque roll" thermal management. Earth's surface as thermal mass for peak smoothing.

Maturing the idea to that point, and then finding and fleshing out opportunities for leverage, benefits from a diversity of expertise. Physics, teaching (various ages), engineering, planetary geology, etc.

So I wonder if it would be useful to think in terms of not just discussion, but also of leveraging existing communities? Orchestration, federation, cross pollination. So bits about radiative cooling rates could go to PhysicsForums.com; about 'why the sky is cold' to /r/AskScienceDiscussion; about 'nice videos of spacecraft doing bbq rolls' to /r/spacex; about teaching aspects to... sigh, it's a mess of mailing lists and blogs and... well, maybe prototypes to teacherspayteachers?; and so on. All pointing back to someplace able to coordinate the input.

Thoughts?