@abhibeckert

abhibeckert@beehaw.org · edit-2 1 year ago

If by document you mean “any kind of data structure”, then yes, those are documents

Yep — that is what I mean by documents, and it’s what I meant all along. The beauty of documents is how simple and flexible they are. Here’s a URL (or path), and here’s the contents of that URL. Done.

But then the term becomes meaningless, as literally anything is a document.

No, because you can’t store “literally anything” in a Postgres database. You can only store data that matches the structure of the database. And the structure is also limited: it has to be carefully designed or else it will fall over (e.g. if you put an index on this column, inserts will be too slow; if you don’t have an index on that column, selects will be too slow; if you join these two tables, the server will run out of memory; if you store these columns redundantly to avoid a join, the server will run out of disk space…).

Sure, but then finding that document takes 5 minutes

Sure - you can absolutely screw up and design a system where you need to read millions of files to find the one you’re looking for, but because it’s so flexible you should be able to design something efficient easily.

I’m definitely not saying documents should be used for everything. But I am saying they should be used a lot more than they are now. It’s so easy to just write up the schema for a few tables and columns, hit migrate, and presto! You’ve got a data storage system that works well. Often it stops working well a year later when users have spent every day filling it with data.

What I’m advocating is to stop and think: should this be in a relational database, or would a document work better? A document is always more work in the short term, because you need to carefully design every step of the process… but in the long term it’s often less work.

Almost everything I work with these days is a hybrid - with a combination of relational and document storage. And often the data started in the relational database and had to be moved out because we couldn’t figure out how to make it performant with large data sets.

Also, I’m leaning more and more into using SQLite, with multiple relational databases instead of just a single database. Often I’m treating that database as a document. And I’m not alone: SQLite is very widely used. Document storage is very widely used. They’re popular because they work, and if you are never using them then I’d suggest you’re probably compromising the quality of your software.
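To make “treating a database as a document” concrete, here is a minimal sketch. It assumes one SQLite file per project; the `notes` table and the function names `open_project_db` and `add_note` are hypothetical, picked only for illustration.

```python
import sqlite3
import tempfile
from pathlib import Path


def open_project_db(root: Path, project_id: str) -> sqlite3.Connection:
    """Open (or create) the SQLite file that belongs to one project."""
    conn = sqlite3.connect(root / f"{project_id}.sqlite")
    conn.execute(
        "CREATE TABLE IF NOT EXISTS notes (id INTEGER PRIMARY KEY, body TEXT NOT NULL)"
    )
    return conn


def add_note(conn: sqlite3.Connection, body: str) -> int:
    """Insert one note and return its row id."""
    with conn:  # commits on success, rolls back if the insert raises
        cur = conn.execute("INSERT INTO notes (body) VALUES (?)", (body,))
    return cur.lastrowid


if __name__ == "__main__":
    root = Path(tempfile.mkdtemp())
    alpha = open_project_db(root, "alpha")
    add_note(alpha, "first note")
    # Each project is a single file that can be copied, archived or deleted.
    print(sorted(p.name for p in root.iterdir()))
```

Each file stays small and self-contained, so a slow query in one project can’t drag down the others.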

abhibeckert@beehaw.org · edit-2 1 year ago

I’m 99% certain this is wrong

? This is how Postgres stores data, as documents, on the local filesystem:

There are hundreds, even thousands, of documents in a typical Postgres database. And similar for other databases.

But anyway, the other side of the issue is more problematic: converting relational data to, for example, an HTTP response.

Persisting data as documents would be atrocious for performance.

Yep… it’s pretty easy to write a query on a moderately large database that returns 1kb of data and takes five minutes to execute. You won’t have that issue if your 1kb is a simple file on disk. It’ll read in a millisecond.

abhibeckert@beehaw.org · edit-2 1 year ago

The article you linked disagrees - they said it pretty well:

Of course, some issues come from the fact that people are trying to use the Relational model where it doesn’t suit their use case. That’s why I prefer a document model instead of a tabular one as the default choice. Most of our applications are more suitable for it, as we’re still moving the regular physical world (so documents) into computers. (Read also more in General strategy for migrating relational data to document-based).

I never joined the NoSQL hype-train so I can’t comment on that. However I will point out that storing documents on a disk is a very well established and proven approach… and it’s even how relational databases work under the hood. They generally persist data on the filesystem as documents.

Where I find relational data really falls over is at the conversion point between relational and document representations. That typically happens multiple times in a single operation - for example when I hit the reply button on this comment (I assume, haven’t read the source code) this is what will happen:

my reply will be sent to the server as a document, in the body of an HTTP request
beehaw’s server will convert that document into relational data (with a considerable performance penalty and a large surface area for bugs)
PostgreSQL is going to convert that relational data back into a document format and write it to the filesystem (more performance issues, more opportunities for bugs)

And every time the comment is loaded (or sent to other servers in the fediverse) that silly “document to relational to document” translation process is repeated over and over and over.

I’d argue it’s better, more efficient, to just store this comment as a document because over and over and over it’s going to be needed in that format and anyway you ultimately need to write it to disk as a document.

Yes - you should also have a relational index containing critical metadata from the document: the relationship linking that document to the comment that I replied to, the number of upvotes it has received, etc. But that should be a secondary database, not the primary one. Things like an individual upvote should also be a document, stored as a file on disk (in the format specified by ActivityStreams 2.0).
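A rough sketch of that layout, under stated assumptions: each comment is a JSON file on disk, and a small SQLite table holds only the metadata needed to find it. The field names (`id`, `inReplyTo`, `upvotes`), the `comment_index` table and the function names are hypothetical, and this is not how Lemmy stores comments.

```python
import json
import sqlite3
import tempfile
from pathlib import Path


def open_index(db_path: str) -> sqlite3.Connection:
    """Create the secondary index: metadata only, no comment bodies."""
    conn = sqlite3.connect(db_path)
    conn.execute(
        "CREATE TABLE IF NOT EXISTS comment_index ("
        "id TEXT PRIMARY KEY, parent_id TEXT, "
        "upvotes INTEGER NOT NULL, path TEXT NOT NULL)"
    )
    conn.execute(
        "CREATE INDEX IF NOT EXISTS idx_parent ON comment_index (parent_id)"
    )
    return conn


def save_comment(root: Path, index: sqlite3.Connection, comment: dict) -> Path:
    """Write the comment as a JSON document, then record its metadata."""
    path = root / f"{comment['id']}.json"
    # The file on disk is the primary copy of the comment.
    path.write_text(json.dumps(comment), encoding="utf-8")
    # The index row can always be rebuilt from the documents.
    index.execute(
        "INSERT OR REPLACE INTO comment_index (id, parent_id, upvotes, path) "
        "VALUES (?, ?, ?, ?)",
        (comment["id"], comment.get("inReplyTo"), comment.get("upvotes", 0), str(path)),
    )
    index.commit()
    return path


def load_replies(index: sqlite3.Connection, parent_id: str) -> list[dict]:
    """Use the index to locate replies, then read each document from disk."""
    rows = index.execute(
        "SELECT path FROM comment_index WHERE parent_id = ? ORDER BY upvotes DESC",
        (parent_id,),
    ).fetchall()
    return [json.loads(Path(p).read_text(encoding="utf-8")) for (p,) in rows]


if __name__ == "__main__":
    root = Path(tempfile.mkdtemp())
    index = open_index(":memory:")
    save_comment(root, index, {"id": "c2", "inReplyTo": "c1", "upvotes": 3, "content": "A reply"})
    print(load_replies(index, "c1"))
```

Serving a comment is then a plain file read, and the relational part only answers “which documents do I need?”.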

abhibeckert@beehaw.org · edit-2 1 year ago

I wouldn’t call that “near” SQL, I’d basically just call it SQL. Nothing wrong with that… SQL is great, and using proper language constructs instead of strings makes it even better… but it’s not solving the same problem as an ORM.

abhibeckert@beehaw.org · edit-2 1 year ago

For me the article touches on the problem but doesn’t actually reveal it.

What I see day in and day out is projects using a relational database to store data that is not suited to a relational database. And you can often get away with that fundamental mistake when you’re writing raw SQL queries… but as soon as an ORM is involved you’re in for a world of pain (or at least, problems with performance).

abhibeckert@beehaw.org · edit-2 1 year ago

OK, but what about reading the data?

I’ve never had a problem writing to a database with an ORM. The problems happen when you try to read data (quickly and efficiently).

abhibeckert@beehaw.org · edit-2 1 year ago

WASM allows arbitrary code execution in an environment that doesn’t include the DOM… however it can communicate with the page where the DOM is available, and it’s trivial to set up an abstraction layer that gives you the full suite of DOM manipulation tools in your WASM space. Libraries for WASM development generally provide that for you.

For example here’s SwiftWASM:

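import JavaScriptKit

// Look up the page's document object through the JavaScript global scope
let document = JSObject.global.document

// Create a <div>, set its text, and attach it to the page body
var divElement = document.createElement("div")
divElement.innerText = "Hello, world"
_ = document.body.appendChild(divElement)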

It’s pretty much exactly the same as JavaScript, except you need to use JSObject to access the document object (Swift can do globals, but they are generally avoided), and Swift also presents a compiler warning if you call a function (like appendChild) without doing anything with the result. Assigning it to a dummy “underscore” variable is the Swift idiom for telling the compiler you don’t want the output.

abhibeckert@beehaw.org · edit-2 1 year ago

Compile time is my biggest issue with TypeScript. I’ve used JavaScript for decades with compile time measured in, what, a millisecond or two. Having to wait for TypeScript drives me nuts.

abhibeckert@beehaw.org · edit-2 1 year ago

I am an experienced programmer. I can do C/C++/Rust/assembly/Ruby/Perl/Python/ etc… The language itself is not a barrier.

Well, first of all, don’t try to use any of those languages and recognise that the language is a barrier. Choosing the right tool for the job is critical. Those are great languages… but as far as I know there are precisely zero good user interface frameworks available in those languages.

Just like a good function starts by picking a good name and argument list, a good user interface has to start with a good user interface design. Unfortunately user interfaces are complex beasts and it’s virtually impossible to get them right the first time. You absolutely must pick a user interface tool/language/etc which allows you to make major changes (including scrapping the whole thing and starting over) in a short amount of time (minutes, preferably).

The best user interface languages are declarative ones. You should be describing the structure of your interface, largely ignoring the functionality - that’s something which can either be done for you by the framework or done yourself as a completely separate task, in a different file, maybe even a different git repository, and probably a different programming language.

It should be possible to get a rough interactive version of your app up and running very quickly, so you can test it, learn what works/doesn’t work, show it to other people, and you need to be able to rewrite entire sections of the interface by simply rewriting two or three lines of source code.

I recommend HTML/CSS as a good starting point. Get your head around that first (it won’t take long, it’s relatively simple), then look into more modern tools like React Native. Learn to crawl before trying to walk.

The article you linked to is just wrong. It suggests this process:

1. Define a layout for each screen that has UI elements.
2. Create source code for all of the app’s components.
3. Build and run the app on real and virtual devices.
4. Test and debug the app’s logic and UI.
5. Publish the app.

Step 4 needs to be tightly integrated into Step 1. Start working on step 2 after you have finished step 4 (and then, after you’ve done steps 2 and 3, you will need to repeat step 4).

I encourage you to read fewer articles. They often give really bad advice, and without experience it’s impossible to know which ones are good. Instead pay for ChatGPT Plus and just ask it questions: “How do I make a button in HTML/CSS”, “how do I make it execute code when the user clicks it”, or “how can I deploy an HTML/CSS/JavaScript app on Android”.

abhibeckert@beehaw.org · edit-2 1 year ago

I don’t even know what Turbo 8 is

Maybe you should find out?

The idea behind Turbo is your server sends HTML/CSS to the client, and when the content needs to be updated… the server simply sends new HTML which Turbo will inject into the page. You can also annotate links so they fetch new content from the server instead of navigating to a new URL.

Your server side code can be written in whatever language you prefer… Turbo being a 37Signals project I assume they’re using Ruby. It’d work fine with TypeScript too if that’s your thing. Turbo just uses HTTP / JSON to talk to the server and doesn’t have a server side component.

You can have client side code, but AFAIK there’s pretty minimal interaction with Turbo - you might for example add an event listener that processes the HTML and converts ISO date/times using Date.toLocaleString().

If you’re writing complex client side code then you shouldn’t be using Turbo at all.

This change doesn’t affect, at all, the language used by users of Turbo. What’s changed is the Turbo dev team themselves have chosen to write Turbo in vanilla javascript. And there are advantages to vanilla JS - it removes the compilation step from one language to another, for example.

abhibeckert@beehaw.org · edit-2 1 year ago

On some unix systems (MacOS for example) you can’t even do that with root.

You’d need to reboot into firmware, change some flags on the boot partition, and then reboot back into the regular operating system.

To install a new version of the operating system on a Mac, it creates a new snapshot of your boot hard drive, updates the system there, then reboots, instructing the firmware to boot from the new snapshot. The firmware does a few checks of its own as well, and if the new snapshot fails to boot then it will reboot on the old snapshot (which is only removed after successfully booting on to the new one). That’s not only a better/more reliable way to upgrade the operating system, it’s also the only way it can be done because even the kernel doesn’t have write access to those files.

The only drawback is you can’t use your computer while the firmware checks/boots the updated system. But Apple seems to be laying the foundations for a new process where your updated operating system will boot alongside the old version (with hypervisors) in the background, be fully tested/etc, and then it should be able to switch over to the other operating system pretty much instantly. It would likely even replace the windows of running software with a screenshot, then instruct the software to save its state and relaunch to restore functionality to the screenshot windows (they already do this if a Mac’s battery runs really low - closing everything cleanly before power cuts out, then restoring everything once you charge the battery).

abhibeckert@beehaw.org · edit-2 1 year ago

When I last used a computer that had a single mode (about 20 years ago), I was in the habit of saving my work about every 15 seconds and manually backing up my documents (to an offline backup that wasn’t physically connected to the computer) multiple times per day.

That’s how often the computer crashed. I never had a virus in those days, it was always innocent and unintentional software bugs which would cause your computer to need a reboot regularly and occasionally delete all of your files.

Trust me, things are better now. I still save regularly and maintain backups, but I do it a lot less religiously than I used to, because I’ve lost my work just once in the last several years. It used to be far more often.

abhibeckert@beehaw.org · 1 year ago

Excel can’t import a CSV file reliably though - and neither can any other spreadsheet software I’ve ever tested. They have problems with dates, numeric values, etc.

The only reliable way to work with CSV is in a programming language of your choice or a plain text editor.
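As a small illustration of the programming-language route, here is a sketch using Python’s standard `csv` module, which returns every field as the exact string from the file and leaves any conversion to you. The sample data and the `read_rows` helper are made up for the example.

```python
import csv
import io

# Hypothetical sample: the leading zero, the date and the long number are the
# kinds of values a spreadsheet import tends to reinterpret.
SAMPLE = 'id,postcode,joined,account\n1,"0800",2023-03-04,12345678901234567890\n'


def read_rows(text: str) -> list[dict]:
    """Parse CSV text, keeping every field as an unmodified string."""
    return list(csv.DictReader(io.StringIO(text, newline="")))


if __name__ == "__main__":
    for row in read_rows(SAMPLE):
        # No type guessing happens here; convert fields explicitly if needed.
        print(row)
```

Any conversion to dates or numbers is then an explicit step in your own code, so nothing is silently rewritten on import.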