# Which database system is the most in the spirit of FreeBSD?



## dbdemon (Dec 14, 2021)

Is there a database system which is created, developed and/or designed more "in the spirit" of FreeBSD?

Obviously, Berkley DB was originally created and developed as part of BSD, so is probably a strong contender.

If the question was about Unix rather than FreeBSD specifically, would the answer be different?


----------



## rootbert (Dec 14, 2021)

PostgreSQL: similar licence and a strong focus on clean design


----------



## msplsh (Dec 14, 2021)

Postgres


----------



## covacat (Dec 14, 2021)

here it is the most bsd-ish db first distribution, including documentation and source code

```
whereis -q awk sh grep comm paste sort join|tr " " "\n"|tar --files-from - -cvf freebsd-db-0.0.0.0.1.tar
```


----------



## eternal_noob (Dec 14, 2021)

PostgreSQL is nice once you get used to it. But at first, you have to memorize the weird replacements for MySQL commands, e.g.

```
SHOW TABLES;
```
becomes

```
\dt
```
and so on.


----------



## covacat (Dec 14, 2021)

eternal_noob said:


> PostgreSQL is nice once you get used to it. But at first, you have to memorize the weird replacements for MySQL commands, e.g.
> 
> ```
> SHOW TABLES;
> ...


that's why you use flat files and grep
you can't forget ls ...


----------



## eternal_noob (Dec 14, 2021)

Yeah, you definitely need a cheat sheet for the first few weeks.


----------



## mer (Dec 14, 2021)

OP, please define your "needs" in a database.
Do you need SQL ability?
Or do you need other capabilities?

flat text files and grep/sed/awk work pretty well for a large chunk of things.
But SQL works for different things.


----------



## dbdemon (Dec 14, 2021)

mer said:


> OP, please define your "needs" in a database.
> Do you need SQL ability?
> Or do you need other capabilities?
> 
> ...


grep/sed/awk are cute, but in my view not sufficient for a serious database system.

Personally, I like SQL, but if there is a serious NoSQL solution, then it should be considered. Berkeley DB was originally "NoSQL", but Oracle has since added an SQL interface, if I understand correctly.


----------



## mer (Dec 14, 2021)

dbdemon said:


> grep/sed/awk are cute, but in my view not sufficient for a serious database system.


But you need to define your specific needs.  flat files with grep/sed/awk is more than sufficient for some use cases, but not all.  For some use cases, full SQL capabilities are overkill.  So for some use cases MySQL is fine, others Berkley DB is fine, others you need more.


----------



## hruodr (Dec 14, 2021)

I have some plain text dbs that look up with grep and scripts, 
that works faster than many db engines.
If I want SQL, I try first sqlite3. It is FreeBSD spirit, because pkg uses it.
For client server I used MySQL and tried PostgreSQL, I like the second more.


----------



## dbdemon (Dec 14, 2021)

covacat said:


> here it is the most bsd-ish db first distribution, including documentation and source code
> 
> ```
> whereis -q awk sh grep comm paste sort join|tr " " "\n"|tar --files-from - -cvf freebsd-db-0.0.0.0.1.tar
> ```





mer said:


> flat text files and grep/sed/awk work pretty well for a large chunk of things.


On this topic, I came across this article the other day:








						There's a Relational Database in Your Unix CLI
					

With the unix join command, you can process relational data at the command line. Here's a brief introduction on how to use it.




					spin.atomicobject.com
				




I thought that was novel. But not nearly sufficient for what I want in a database system.


----------



## mer (Dec 14, 2021)

dbdemon said:


> I thought that was novel. But not nearly sufficient for what I want in a database system.


the second sentence here drives what people will say.  I've gone back through this whole thread a couple times and I can't find where you say what you want in a database system.

lay out your requirements and then folks can have a better answer to your question.


----------



## dbdemon (Dec 14, 2021)

mer said:


> the second sentence here drives what people will say.  I've gone back through this whole thread a couple times and I can't find where you say what you want in a database system.
> 
> lay out your requirements and then folks can have a better answer to your question.


I simply wish to learn more about which database systems knowledgable FreeBSD users/admins/developers consider to be "in the spirit" of FreeBSD and Unix. I don't have requirements or needs as such.

I have noted that some of you think sed/grep/awk/sort/join combined with flat text files are one possible option. Thank you.


----------



## astyle (Dec 14, 2021)

To answer the question in the Opening Post, I'd say Berkeley DB - it's what sendmail uses, and it's the original. Yeah, it's not a proper 'relational' database, but it is part of the FreeBSD base.
--
The article OP references in post #12 is most likely about this utility: join(1). A quick read through the manpage reveals that it's really meant to work on a text file. I would not treat it as an alternative to SQL proper. And `/usr/bin/join` is not going to work on BDB or any RDBMS - just totally separate and different things.


----------



## Alain De Vos (Dec 14, 2021)

covacat said:


> that's why you use flat files and grep
> you can't forget ls ...


csv
PS: kyotocabinet & tokyocabinet are interesting.


----------



## covacat (Dec 14, 2021)

Alain De Vos said:


> PS: kyotocabinet & tokyocabinet are interesting.


are those namco/sega/konami/capcom arcade machines ?


----------



## covacat (Dec 14, 2021)

for quick & dirty hacking, embedding sqlite3 > *
client / server postgres is more bsd-ish than mysql


----------



## ralphbsz (Dec 15, 2021)

dbdemon said:


> more "in the spirit" of FreeBSD?


What do you think is the spirit of FreeBSD?

There are lots of coffee places on Shattuck Avenue. I bet lots of database code was fueled by espressos served there. As an example, the origin of Berkeley RISC project (later commercialized as the SPARC) was that John Cocke (the inventor of the first RISC machine) used to go drink coffee with Dave Patterson every few weeks.

Or maybe the spirit of BSD can be found in Kirk McKusick's wine cellar (which has been famously computer monitored for decades, same with his hot tub).

OK, now serious: Instead of jokes about what people in Berkeley drink, please define what you think is that elusive "spirit", and then maybe we can answer it.


----------



## eternal_noob (Dec 15, 2021)

ralphbsz said:


> Kirk McKusick ... computer monitored ... hot tub




I never thought bad about Kirk but this is weird. (If it's true).


----------



## blind0ne (Dec 15, 2021)

covacat said:


> here it is the most bsd-ish db first distribution, including documentation and source code
> 
> ```
> whereis -q awk sh grep comm paste sort join|tr " " "\n"|tar --files-from - -cvf freebsd-db-0.0.0.0.1.tar
> ```


`tar --files-from - -cvf freebsd-db-0.0.0.0.1.tar`
what is this `--files-from -` ? 
The set of programs is quite good advice, but it would be much more pleasant to find out about common practices and set of rules to follow in order to achieve the similar functionality and throw off the shackles of databases at least for the thoughts time.


----------



## hardworkingnewbie (Dec 15, 2021)

INGRES of course, since it was developed back in the old days at Berkeley.


----------



## covacat (Dec 15, 2021)

blind0ne said:


> `tar --files-from - -cvf freebsd-db-0.0.0.0.1.tar`
> what is this `--files-from -` ?
> The set of programs is quite good advice, but it would be much more pleasant to find out about common practices and set of rules to follow in order to achieve the similar functionality and throw off the shackles of databases at least for the thoughts time.


read file names to archive from stdin


----------



## dbdemon (Dec 15, 2021)

ralphbsz said:


> OK, now serious: Instead of jokes about what people in Berkeley drink, please define what you think is that elusive "spirit", and then maybe we can answer it.


I suppose the design philosophy, licencing, development model, and to a lesser degree also origins and history on BSD systems or other Unix systems. And so on ...

But I'm happy to hear other interpretations of "spirit of FreeBSD / Unix".


----------



## blind0ne (Dec 15, 2021)

mer said:


> the second sentence here drives what people will say.  I've gone back through this whole thread a couple times and I can't find where you say what you want in a database system.
> 
> lay out your requirements and then folks can have a better answer to your question.


Are there any working in "production" websites, services on the web that is using such approach as here is mentioned in?


dbdemon said:


> On this topic, I came across this article the other day:
> 
> 
> 
> ...


----------



## dbdemon (Dec 15, 2021)

blind0ne said:


> Are there any working in "production" websites, services on the web that is using such approach as here is mentioned in?


I would be surprised if this was used for anything particularly serious in production. It's a "hack" that demonstrates the power of the Unix command-line tools.

Given the hesitance I'm detecting in this thread to embracing more fully-featured SQL DB systems (although honourable mentions of DB systems such as Postgres), is there a sense that such DB systems are contrary to the design principles/philosophy behind FreeBSD / Unix? Yes, I understand the attitude that you want to use the right tool for the job, and not necessarily choose a tool which is overkill and would introduce unnecessary bloat. However, I'm thinking about how e.g. MySQL/MariaDB can be configured to circumvent the OS and access storage devices directly, and more generally how more fully-featured DB systems tend to replace multiple features already present in the OS with their own, purportedly more performant features. Am I on to something here?


----------



## hruodr (Dec 15, 2021)

dbdemon said:


> I'm thinking about how e.g. MySQL/MariaDB can be configured to circumvent the OS and access storage devices directly, and more generally how more fully-featured DB systems tend to replace multiple features already present in the OS with their own, purportedly more performant features.


BTW, see:



			35% Faster Than The Filesystem


----------



## hardworkingnewbie (Dec 15, 2021)

dbdemon: The point is that your original question right from the beginning doesn't make a lot of sense at all, because what exactly does "created, developed and/or designed more "in the spirit" of FreeBSD?" mean? That's an absolutely wishy-washy phrase, sounds good and totally undefined by yourself, so everybody has a different idea on mind what this should entail, or not.

So you are getting a broad variety of different answers. So this is why you are not on to something here, because your whole question is really muddy.

If it's embedded SQL storage, then SQLite. For everything else - Postgres. End of story.


----------



## dbdemon (Dec 15, 2021)

hardworkingnewbie said:


> dbdemon: The point is that your original question right from the beginning doesn't make a lot of sense at all, because what exactly does "created, developed and/or designed more "in the spirit" of FreeBSD?" mean? That's an absolutely wishy-washy phrase, sounds good and totally undefined by yourself, so everybody has a different idea on mind what this should entail, or not.
> 
> So you are getting a broad variety of different answers. If you want better answers, then you should ask a better question from the start!


The question is deliberately a little vague exactly because I would like to see a variety of answers and reasons.


----------



## astyle (Dec 15, 2021)

dbdemon said:


> I would be surprised if this was used for anything particularly serious in production. It's a "hack" that demonstrates the power of the Unix command-line tools.
> 
> Given the hesitance I'm detecting in this thread to embracing more fully-featured SQL DB systems (although honourable mentions of DB systems such as Postgres), is there a sense that such DB systems are contrary to the design principles/philosophy behind FreeBSD / Unix? Yes, I understand the attitude that you want to use the right tool for the job, and not necessarily choose a tool which is overkill and would introduce unnecessary bloat. However, I'm thinking about how e.g. MySQL/MariaDB can be configured to circumvent the OS and access storage devices directly, and more generally how more fully-featured DB systems tend to replace multiple features already present in the OS with their own, purportedly more performant features. Am I on to something here?


No, you're not on to anything. A proper RDBMS (like MySQL/MariaDB/PostreSQL) is just a very separate technology from an OS. You can have an Oracle install running on a UNIX machine or on a Windows server, and the design of the Oracle RDBMS will be exactly the same in either case.
--
There's a BIG difference between text-processing tools (which at best should not try to process more than 10MB of relatively free-form text) and managing terabytes of well-organized data (which is what RDBMS systems are for).


----------



## ralphbsz (Dec 15, 2021)

eternal_noob said:


> I never thought bad about Kirk but this is weird. (If it's true).


Don't worry, it's nothing weird or indecent. In the mid- or late 90s, Kirk was one of the first people to have a personal web site, and it included some home automation, long before such things were fashionable or easy. And his web site had "real time" tracking of the temperature of his wine cellar (which is very important when wanting to drink wine) and of the hot tub. Those web pages are long gone. Today, we're more concerned about privacy, so for example at our house I can monitor the pressure of our water supply and the fill level of our water tanks, but that information is not on the public web:






dbdemon said:


> However, I'm thinking about how e.g. MySQL/MariaDB can be configured to circumvent the OS and access storage devices directly, and more generally how more fully-featured DB systems tend to replace multiple features already present in the OS with their own, purportedly more performant features. Am I on to something here?


That is a long-running debate. Since time immemorial, databases have used their own storage management, often bypassing the OSes facilities (file systems in particular). This goes back to the mainframes of the 1960s. To some extent, this is good, because real database experts can tune things incredibly well. To some extent, it is bad, because modern storage systems are very fast, and often running a database on top of storage software (file systems, volume managers, RAID, object storage, ...) is actually faster than going directly to hardware. And when using the full-featured storage stack, you get things like snapshots, backups, automated management (space allocation). I think the consensus is that today, databases that bypass the storage software stack are making a short-sighted mistake, but that is sometimes debated, and not true for all situations.


----------



## Jose (Dec 15, 2021)

ralphbsz said:


> That is a long-running debate. Since time immemorial, databases have used their own storage management, often bypassing the OSes facilities (file systems in particular). This goes back to the mainframes of the 1960s. To some extent, this is good, because real database experts can tune things incredibly well. To some extent, it is bad, because modern storage systems are very fast, and often running a database on top of storage software (file systems, volume managers, RAID, object storage, ...) is actually faster than going directly to hardware. And when using the full-featured storage stack, you get things like snapshots, backups, automated management (space allocation). I think the consensus is that today, databases that bypass the storage software stack are making a short-sighted mistake, but that is sometimes debated, and not true for all situations.


My first serious professional mistake was to newfs the raw partitions the Oracle database was using on a Sun server. The Anderson consultants were very annoyed with me.

My checkered past includes some Oracle experience. Oracle went back-and-forth bewteen "use our storage management system" and "use the filesystem" at least twice.


----------



## Alain De Vos (Dec 15, 2021)

My biggest mistakes where not professional. Offcourse i cannot elaborate. But hey, it was a lesson !
And on postgresql : Pipe vs Socket vs IPC.


----------



## dbdemon (Mar 22, 2022)

This thread/topic has been on my mind for some time even if I haven't posted again since last year. I have some more thoughts, but it'll have to wait.

In the meantime, I thought I should share this video demonstrating databases/recutils. Beware, this is a GNU product! But I think it does check a lot of boxes for a "Unix database system".




_View: https://youtu.be/qnlkr3mCqW8 _

(Disclaimer: I'm obviously not saying this sort of software can compete with a RDBMS in terms of performance, scale, user management, table joins and so on.)

See also Wikipedia: https://en.wikipedia.org/wiki/Recfiles


----------



## bakul (Mar 23, 2022)

Hope someone ports visidata to bsd: https://www.visidata.org/
It seems very convenient for tabular data exploration. Here’s a lightening demo: 



_View: https://www.youtube.com/watch?v=N1CBDTgGtOU_


----------



## dbdemon (Mar 23, 2022)

I think the ideal database system "in the spirit" of FreeBSD/Unix would be an RDBMS with both traditional SQL data access and also "text file" interface to the same data.


----------



## astyle (Mar 23, 2022)

dbdemon said:


> I think the ideal database system "in the spirit" of FreeBSD/Unix would be an RDBMS that in addition to SQL access to its data somehow also supported accessing its data tables as if they were text files.


SQL92 offers the API to do a text dump of `select * from database.table`. UNIX also offers plenty of ways to do a text dump. Unfortunately,  just the very design of on-disk files that are accessible via SQL makes those files incompatible with being processed with stuff like grep. It's like trying to open a.out with a text editor - you do need a hex editor to open a.out, or GDB. It's that kind of difference.


----------



## dbdemon (Mar 23, 2022)

astyle said:


> SQL92 offers the API to do a text dump of `select * from database.table`. UNIX also offers plenty of ways to do a text dump. Unfortunately,  just the very design of on-disk files that are accessible via SQL makes those files incompatible with being processed with stuff like grep. It's like trying to open a.out with a text editor - you do need a hex editor to open a.out, or GDB. It's that kind of difference.


(I edited my post to clarify my point, but I think you understood what I meant regardless.)

Yes. But wouldn't it be possible for DBMS vendors to offer a "text file interface" to their tables? At least for reading? Similar to /proc/ or /dev/ filesystems on Unix-like OSes.


----------



## astyle (Mar 23, 2022)

dbdemon said:


> (I edited my post to clarify my point, but I think you understood what I meant regardless.)
> 
> Yes. But wouldn't it be possible for DBMS vendors to offer a "text file interface" to their tables? At least for reading? Similar to /proc/ or /dev/ filesystems on Unix-like OSes.


Umm... due to the very design of databases, this is putting the cart before the horse. The output of the command is in plaintext. Once the command completes, and you have the output, that's when you can put the usual text-processing commands like grep to work. The `select` SQL command is for reading, and `insert` / `delete` SQL commands are for writing.  

BTW, /proc and /dev are not exactly for editing or even reading text. I would strongly suggest that you read a good book to understand what those are for.  Andrew Tanenbaum is a good author of a few of those books.


----------



## dbdemon (Mar 25, 2022)

astyle said:


> Umm... due to the very design of databases, this is putting the cart before the horse. The output of the command is in plaintext. Once the command completes, and you have the output, that's when you can put the usual text-processing commands like grep to work. The `select` SQL command is for reading, and `insert` / `delete` SQL commands are for writing.


I know very well how SQL works, thank you  (After all, my job title is 'senior database engineer'!) I must not be making myself clear. I believe building a different interface than SQL for an SQL database is entirely within the realm of non-fiction. In fact, this was done for MySQL with their document store using their "X protocol". And MariaDB ColumnStore has a Write API which also circumvents the SQL layer to enable more direct write-access to the data. I'm sure there are other examples. But it should also be possible to build a "text file" interface that utilises SQL under the hood.

In fact, it turns out someone did try to do what I'm suggesting, using FUSE - see the README file here:








						GitHub - BMDan/DFuse: DFuse: Database FUSE - MySQL Tables Exposed as a FS via FUSE
					

DFuse: Database FUSE - MySQL Tables Exposed as a FS via FUSE - GitHub - BMDan/DFuse: DFuse: Database FUSE - MySQL Tables Exposed as a FS via FUSE




					github.com


----------



## shkhln (Mar 25, 2022)

dbdemon said:


> But wouldn't it be possible for DBMS vendors to offer a "text file interface" to their tables?


That kind of interface prevents one from effectively using: 1. transactions — your typical file system access API has no commit/rollback functionality; 2. indexes — everything is a full table scan; 3. constraints — POSIX error codes are not going to cut it.

Thus, technical possibility doesn't mean it's a good idea.


----------



## bakul (Mar 25, 2022)

shkhln said:


> That kind of interface prevents one from effectively using: 1. transactions — your typical file system access API has no commit/rollback functionality; 2. indexes — everything is a full table scan; 3. constraints — POSIX error codes are not going to cut it.


In mid/late 1980s there was a unix startup that built in transaction semantics on file system operations. I did some contract work for them but can’t remember their name now — it was some generic name like Relational Systems or something so hard to search for. Its performance was not great — much worse than comparable unix boxes. I think they made the classic mistake of not understanding the difference between a research project with a deliverable product!


----------



## astyle (Mar 25, 2022)

dbdemon said:


> I know very well how SQL works, thank you  (After all, my job title is 'senior database engineer'!) I must not be making myself clear. I believe building a different interface than SQL for an SQL database is entirely within the realm of non-fiction. In fact, this was done for MySQL with their document store using their "X protocol". And MariaDB ColumnStore has a Write API which also circumvents the SQL layer to enable more direct write-access to the data. I'm sure there are other examples. But it should also be possible to build a "text file" interface that utilises SQL under the hood.
> 
> In fact, it turns out someone did try to do what I'm suggesting, using FUSE - see the README file here:
> 
> ...


All you need is a *web-based front-end that talks SQL to the database*... like databases/phpmyadmin... they are a dime a dozen out there.  A 'text-based interface' is a UNIX shell... you start by typing, say 'mysql' into bash, you get dropped into a shell that allows you to type in straight SQL commands and talk to the server that way. Oracle works that way, too, and even MS SQL Server has a utility that works similar.  No need to circumvent anything, just slap on a front-end that reflects your customer's 'Best Practices', and connect it to the proper databases. Yeah, the portion with databases do need to be well-designed, that's the job for a 'senior database engineer'... and I'd expect a  'senior database engineer' to actually know to avoid filesystem shims - that can mess up the file itself to the point that the database is not recoverable. I don't like being this blunt, but I do think it's important to be able to connect the dots properly in this industry.


----------



## Jose (Mar 25, 2022)

dbdemon said:


> I know very well how SQL works, thank you  (After all, my job title is 'senior database engineer'!) I must not be making myself clear. I believe building a different interface than SQL for an SQL database is entirely within the realm of non-fiction. In fact, this was done for MySQL with their document store using their "X protocol"...


Postgresql has Hstore.

A SQL interface to files is essentially what Mysql with the Myisam backend was. It did not support transactions. It was a pretty sad and losing excuse for an RDMBS but gained huge adoption because it was simple and free.


----------



## ralphbsz (Mar 25, 2022)

bakul said:


> In mid/late 1980s there was a unix startup that built in transaction semantics on file system operations. I did some contract work for them but can’t remember their name now — it was some generic name like Relational Systems or something so hard to search for. Its performance was not great — much worse than comparable unix boxes. I think they made the classic mistake of not understanding the difference between a research project with a deliverable product!


There are several things in this area. First, there are file systems that can be "connected" to a database: If you do that, then any file that is mentioned in the database gets transactional semantics, where updates to the file are done under database transactions, so you get ACID properties. The one shipping example I know comes from IBM, which used DB2 and AIX' jfs file system.

Second, there are file systems that allow users to create transactions, and pack multiple read and write operations into a transaction, which is then atomically executed or not. Quite a few research examples exist; the only shipping product ready for use that I remember was a transactional version of Microsoft's NTFS. In the early 2000s, quite a few startups were playing in that field; for some reason, I remember DataBrix (not sure I spelled that right). Internally, many file systems use database techniques such as transactions and logging to achieve consistency.


----------



## bakul (Mar 25, 2022)

ralphbsz said:


> Internally, many file systems use database techniques such as transactions and logging to achieve consistency.


This company allowed *user processes* to implement atomic transaction semantics over *multiple* file operations. This is quite different from using transaction semantics behind the scenes as it were for *individual* file operations. I have a very vague memory of them providing two additional syscalls: begin transaction and commit/abort the current transaction. File ops bracketed by a pair of these calls would be done atomically. But I could be completely wrong! Now I think that this space is still not explored, probably because if you want transaction semantics you’d just use a proper RDBMS.

edit: I imagine a transactional version of NTFS might do something similar.


----------



## ralphbsz (Mar 26, 2022)

Exactly. Transactional file systems do just that: call beginXact, then do a lot of operations under that transaction, then call commitXact (I'm glossing over complicated things like cancel, rollback, collision and all that). There are OODLES of implementation problems here. What happen if the user overwrites the same sector a billion times, alternating between 0x55 and 0xAA? Now in the transaction log we have a billion writes. Where do we store a terabyte of transaction log, so we can undo or redo? What if one program starts a transaction on files a, then adds file b, while the other program starts on b and then adds file a? Sure, you have a deadlock detector, but one of the programs now has to deal with a full cancel. The whole thing is an enormous mess, and requires lots of work. What is the use? Do customers really pay enough to make all the development and testing worthwhile?

And I think in practice your answer is correct: People who really care and up using a database anyway. Problem solved.


----------



## dbdemon (Mar 26, 2022)

astyle said:


> All you need is a *web-based front-end that talks SQL to the database*... like databases/phpmyadmin... they are a dime a dozen out there.  A 'text-based interface' is a UNIX shell... you start by typing, say 'mysql' into bash, you get dropped into a shell that allows you to type in straight SQL commands and talk to the server that way. Oracle works that way, too, and even MS SQL Server has a utility that works similar. No need to circumvent anything, just slap on a front-end that reflects your customer's 'Best Practices', and connect it to the proper databases. Yeah, the portion with databases do need to be well-designed, that's the job for a 'senior database engineer'... and I'd expect a 'senior database engineer' to actually know to avoid filesystem shims - that can mess up the file itself to the point that the database is not recoverable. I don't like being this blunt, but I do think it's important to be able to connect the dots properly in this industry.


I do not think a web-based front-end would generally be considered a tool that follows the Unix philosophy.

I didn't say "text based interface". I said "text file interface". In other words, providing access to the database tables _as if the they were files_ in a filesystem.

I also never suggested I, as a user, wanted to directly access the _actual_ data files of the database. I suggested a DBMS vendor could possibly build a non-SQL interface to their own database system. And then expose the database tables through that interface as (virtual) files in a filesystem.


----------



## astyle (Mar 26, 2022)

dbdemon said:


> I do not think a web-based front-end would generally be considered a tool that follows the Unix philosophy.
> 
> I didn't say "text based interface". I said "text file interface". In other words, providing access to the database tables _as if the they were files_ in a filesystem.
> 
> I also never suggested I, as a user, wanted to directly access the _actual_ data files of the database. I suggested a DBMS vendor could possibly build a non-SQL interface to their own database system. And then expose the database tables through that interface as (virtual) files in a filesystem.





astyle said:


> UNIX is not a religion, buddy.


I've said so in many threads, BTW.

Oh, and I'd like to point out that most RDBMS vendors are not gonna offer a "non-SQL interface to their own database system". They will offer more front-ends like what I was talking about. Sometimes, you gotta wake up to the current industry standards and best practices, instead of recalling some obscure research projects from 15 years ago and imagine that to somehow fit the business scenario that was reviewed like, yesterday.


----------



## dbdemon (Mar 27, 2022)

astyle said:


> UNIX is not a religion, buddy.


Absolutely. But I was talking about what would be an *ideal* database system with respect to the Unix philosophy.



shkhln said:


> That kind of interface prevents one from effectively using: 1. transactions — your typical file system access API has no commit/rollback functionality; 2. indexes — everything is a full table scan; 3. constraints — POSIX error codes are not going to cut it.
> 
> Thus, technical possibility doesn't mean it's a good idea.


I suppose making use of transactions and indexes would require adding special commands: for "start transaction" and "commit", and for searching particular columns. 

Although, if instead a table was represented as a directory, and then within it the individual columns were represented as files (each filename prefixed with the primary key value of the row), then users would not need a special command to specify the column to search on. Also, transactions might not be as important for how I imagine it would be used. (SQL would still be the main interface, whereas the virtual filesystem could be used for more ad-hoc operations.) 

Constraints: Yeah. Some generic error code(s) would have to be used, I suppose. Maybe there could be an error log exposed in a special file within the virtual filesystem. Not perfect.


----------



## bakul (Mar 28, 2022)

Th real question is what do you gain by representing a database as a filesystem?


----------



## ralphbsz (Mar 28, 2022)

A long time ago, there was a research paper that explained how you can use the standard Unix commands sort, uniq, join and awk to implement a makeshift relational database. I can't find it online, and I'm not even sure I've ever seen that paper. Anyone know? It would probably be 30-40 years old by now.


----------



## mark_j (Mar 28, 2022)

bakul said:


> Th real question is what do you gain by representing a database as a filesystem?


I can give you some examples, but time is not my friend at present. Suffice to say, have a read of this, but only pp 29 onwards.

This file system/database allows you to use languages like Basic, Fortran, C, Macro etc on the file system as if it was a database.

It's just a different way of thinking about a file system and is, somewhat, an old/dead concept nowadays but is/was used in other OSs, eg Pick, Prime etc.


----------



## bakul (Mar 28, 2022)

ralphbsz said:


> A long time ago, there was a research paper that explained how you can use the standard Unix commands sort, uniq, join and awk to implement a makeshift relational database. I can't find it online, and I'm not even sure I've ever seen that paper. Anyone know? It would probably be 30-40 years old by now.



There is this 1991 paper: The UNIX Shell As a Fourth Generation Language


----------



## astyle (Mar 28, 2022)

Ahh... sort, uniq, and awk work best on the *output* of SQL commands. Did everyone forget that the strength of UNIX is actually pipes? 

A sysadmin is nothing more than the high-tech equivalent of a plumber who works the toilets and kitchen sinks, and knows better than you why pipes need maintenance, and to be separated based on where the water is coming from.  Sorry to be gross, but any self-respecting plumber would know to never mix the downstream outputs from toilets and kitchen sinks. 

Relating all that back to UNIX design: UNIX pipes are also one-way data flow mechanism, they connect outputs from one place/process to inputs elsewhere.  But try playing with the order of the commands whose outputs get piped - you'll get a HUGE mess. At best - it's not what you're even looking for.

As for UNIX design vs database design - there's a reason those are separate. Yes, it's *not impossible* to pretend that database tables are regular files, and to try and use text-processing tools on them - but the bigger the database, the worse the performance hit, and that was noticed back in 80s.


----------



## Vull (Mar 28, 2022)

eternal_noob said:


> PostgreSQL is nice once you get used to it. But at first, you have to memorize the weird replacements for MySQL commands, e.g.
> 
> ```
> SHOW TABLES;
> ...




```
case sqlmy: $q = "SHOW TABLES"; break;
    case sqlpg: $q = "SELECT table_name FROM information_schema.tables
        WHERE table_schema='public' AND table_type='BASE TABLE'
        ORDER BY table_name"; break;
```


----------



## ralphbsz (Mar 28, 2022)

astyle said:


> Ahh... sort, uniq, and awk work best on the *output* of SQL commands.


But there is a tension here: SQL can do everything that sort, uniq and (simple) awk can do. So you have the option to do everything within the SQL framework, with a coherent set of commands (SQL statements). Or you can split the work between SQL and the traditional Unix tools; now there are two different things to program and maintain. I would think in most cases having a single coherent implementation is preferable.



> As for UNIX design vs database design - there's a reason those are separate. Yes, it's *not impossible* to pretend that database tables are regular files, and to try and use text-processing tools on them - but the bigger the database, the worse the performance hit, and that was noticed back in 80s.


Modern databases have highly complex encoding and compression of the data that's stored in tables. That's particularly true for "big" databases, the ones used in data mining, which are usually stored in columnar format. That allows interesting sorting, delta compression and bit encoding, making it incredibly efficient (I've seen examples where additional rows in a database need less than 1 byte per row, meaning the compression has reached the level of encoding into individual bits).

But that also means that the columnar format files on disk are virtually impossible to understand, unless you use the database's reading code.


----------



## bakul (Mar 28, 2022)

ralphbsz said:


> Modern databases have highly complex encoding and compression of the data that's stored in tables. That's particularly true for "big" databases, the ones used in data mining, which are usually stored in columnar format. That allows interesting sorting, delta compression and bit encoding, making it incredibly efficient (I've seen examples where additional rows in a database need less than 1 byte per row, meaning the compression has reached the level of encoding into individual bits).


One of the reasons I like array languages such as APL & particularly K is for their use in “columnar databases”. It’s a slightly different way of processing data than SQL and more powerful (but less user friendly). I have wanted to write an “array shell” that makes data wrangling as easy as writing shell scripts….


----------

