jchw 2 days ago

This really would've come in handy when I was debugging my own SQLite parser a couple weeks ago.

One thing that initially confused me was how exactly the pages worked w.r.t. the first page on disk... I misunderstood the SQLite documentation in different ways, but it's really rather simple: the very first page is just treated as containing the file header in it, and it pushes down the rest of the data, making the page shorter than the other pages. You can see that illustrated clearly if you click into the first page of a database using this tool: the database header comes first, then the page header.

This tool will undoubtedly come in handy for anyone who has a reason to be dealing with SQLite data structures directly for whatever reason, especially since the SQLite documentation is a bit terse at times.

  • burntcaramel a day ago

    Yes I’m also working on a SQLite parser, mine is in raw WebAssembly. Is yours open source too? This tool will be so useful. I have basic page reading and parsing of the CREATE TABLE schema: https://github.com/RoyalIcing/SilverOrb/blob/9dacad0ce521b0d...

    My plan is to create a miniature .wasm module to read .sqlite files that works in the browser. It will be in the tens of kilobytes rather than the 1 megabyte that the official fantastic sqlite.wasm is. The reduced download means even on 3G you ought to be able to load within a few seconds. You can use SQLite files as your network payloads then, and perhaps even as the working mutable state synced between server and clients.

    • jchw a day ago

      Mine is in TypeScript and for the purpose of parsing a file that happens to use sqlite, so I don't think I'll bother parsing the CREATE TABLE schema unless I have to. It's not currently posted anywhere but will be open source. It works, but the code isn't particularly great :)

  • hinkley 2 days ago

    I really want a data format that is effectively binary JSON. What is the subset of all of the features of SQLite that makes either a read-only or an updatable data set that is compact. But better searchability than a streaming parser.

    • jchw 2 days ago

      If you want to maintain the properties that SQLite has for read use cases, you'll need to replicate a couple of features. At the very least, you'll probably want the format to still be page-based with a BTree structure. You really could get away with just using the SQLite format if you didn't mind the weirdness; a functional SQLite parser that can read tables would not be a significant amount of code. I think, though, that if you want to read the schema as SQLite understands it, you'd need to interpret the CREATE TABLE syntax, which would make it a bit more complex for sure. Otherwise, you can read tables and columns themselves relatively easily, and the values are all stringified.

      • hinkley 2 days ago

        Yeah if I wasn’t clear I’m talking about a minimal file that SQLite can still open read only without errors, not a third party implementation. Though there might be a few tweaks that would allow SQLite to be a bit more lenient. For instance missing metadata that can be assumed. Maybe b tree nodes exceeding the usual load factor.

    • 79a6ed87 a day ago

      Have you tried MessagePack[0]?

      0: https://msgpack.org/index.html

      • hinkley 16 hours ago

        I would probably just use bson or gRPC. As o clarified elsewhere, I means JSON as an analogy. I want something that can be scanned and queried cheaply.

    • cwmma 2 days ago

      Parquet or some other column oriented data format is probably closest to what you want without getting into indexing your flat files or similar

    • Retr0id 2 days ago

      sqlite itself supports a binary encoding of JSON: https://sqlite.org/jsonb.html

      • hinkley 2 days ago

        When I said binary JSON I didn’t mean literal JSON. I meant “common denominator interchange format”. It’s too chatty by far and has dismal performance for queries. So you’re better off asking a specific question and getting a larger document that could answer many questions that you do t yet have. For CDNs things like this matter a lot.

    • w10-1 2 days ago

      MongoDB's BSON?

      • hinkley 16 hours ago

        Mongo sits on a throne of lies and I will never condone anyone using it for any purpose except to make a joke.

  • invisal a day ago

    Glad you like it.

roelschroeven a day ago

I tried with a simple database, and it just shows me a blank screen. The console in developer tools shows an exception (both in Firefox and Chromium):

    index-CGfYebmt.js:85 Uncaught RangeError: Offset is outside the bounds of the DataView
        at DataView.prototype.getUint32 (<anonymous>)
        at R0 (index-CGfYebmt.js:85:4683)
        at X0 (index-CGfYebmt.js:85:7966)
        at index-CGfYebmt.js:85:8157
        at Array.map (<anonymous>)
        at Q0 (index-CGfYebmt.js:85:8150)
        at index-CGfYebmt.js:85:8240
        at Object.useMemo (index-CGfYebmt.js:48:46087)
        at My.$.useMemo (index-CGfYebmt.js:17:7276)
        at V0 (index-CGfYebmt.js:85:8228)
    R0 @ index-CGfYebmt.js:85
    X0 @ index-CGfYebmt.js:85
    (anonymous) @ index-CGfYebmt.js:85
    Q0 @ index-CGfYebmt.js:85
    (anonymous) @ index-CGfYebmt.js:85
    useMemo @ index-CGfYebmt.js:48
    My.$.useMemo @ index-CGfYebmt.js:17
    V0 @ index-CGfYebmt.js:85
    tc @ index-CGfYebmt.js:48
    bc @ index-CGfYebmt.js:48
    lo @ index-CGfYebmt.js:48
    No @ index-CGfYebmt.js:48
    wm @ index-CGfYebmt.js:48
    Xc @ index-CGfYebmt.js:48
    Ao @ index-CGfYebmt.js:48
    Qo @ index-CGfYebmt.js:48
    me @ index-CGfYebmt.js:25
NoSalt 2 days ago

Really nice, looks great!

However, I am not a fan of uploading databases to "strange" sites on the internet, so I will probably never use this.

  • senkora 2 days ago

    I wish that we had a browser version of “pledge” that would:

    1. Permanently restrict the browser tab from accessing the network.

    2. Show an indication of this in the browser UI outside of the content area.

    It would be perfect for this kind of app.

    • runjake 2 days ago

      In this case, itt's easier to just run it locally: https://github.com/invisal/sqlite-internal

      But you can get sort of what you want by:

      1. Loading a site.

      2. Go into the Web Dev Tools.

      3. Select the Network tab.

      4. Click on the "No Throttling" dropdown and check "Offline".

      5. You should now see a yellow warning icon in your Network tab and it should prevent all network access. If you reload the tab you should get the "No Internet Connection" error page.

      • creatonez 2 days ago

        What a fantastic method, I'm gonna start using this for all the random tools I upload data to expecting they are fully clientside.

    • Timon3 a day ago

      I've long been wishing for the same! With service workers etc. this would be a perfect approach for many small utilities. I've started building things as single HTML files using vite-plugin-singlefile[0], but users either have to manually block network requests, or they have to trust me. It would be awesome if they could be sure that nothing will be uploaded anywhere.

      The pledge would have to be persistent, which would make e.g. service worker updates difficult - ideally the browser should show you that an update is available, and offer to delete all stored information so nothing can be saved to localStorage etc. to be uploaded afterwards.

      [0]: https://www.npmjs.com/package/vite-plugin-singlefile

  • jchw 2 days ago

    AFAICT it's purely frontend, there is no uploading. Aside from running it yourself, another thing you can do is go into devtools and set the network throttling to "offline". (Note that this is not fool-proof, but it does prevent new connections from being established.)

  • invisal a day ago

    It is completely frontend. Everyone can review the code or host it themselve. :)

  • brulard 2 days ago

    There is a github link so you can run it yourself

hamstergene a day ago

This is super cool and will really help anyone studying how to build their own storage engines.

A note: the table that describes the data structures (Field # | Description | Value) should also list the data type (uint8 uint16 int32 text etc.). If it's an enum ("Table Leaf"), reference other possible enum values. If its length depends on another field, reference that field No.

graemep 2 days ago

Very cool.

It would be nice if it tied the format in a bit more to the schema and data being shown, but that is a very minor gripe given how nice a tool this is.

NathanFlurry 2 days ago

I've been enjoying watching you build this on X – nice work.

  • invisal 2 days ago

    Thanks so much :)

mubou 2 days ago

This is really cool.

nraynaud a day ago

this is a wonderful learning tool, thank you.

throw0101d 2 days ago

Meta: always found it interesting that .dev was allowed to be a TLD:

* https://en.wikipedia.org/wiki/.dev

More on-topic: another online option:

* https://sqliteviewer.app

* https://inloop.github.io/sqlite-viewer/

Local app:

* https://sqlitebrowser.org

  • invisal 2 days ago

    This is a different tool—SQLite File Format Viewer is made for anyone interested in exploring the internal structure of SQLite.

  • pppone 2 days ago

    I really like the VS Code extension for sqliteview.app - [0]. It also offers an edit feature for a reasonable price, in my opinion. I'm curious if there are any competing editor alternatives.

    [0] - https://vscode.sqliteviewer.app/

    • Kwpolska 2 days ago

      The only reasonable price for editing a SQLite database is $0.

      • lupusreal 2 days ago

        Or $2000 for a perpetual license to the Sqlite Encryption Extension, if the database is encrypted ;)

        Honestly not a bad price if you're building it into a real product. Steep for hobbyists though.