Show HN: Gun v0.1.0 – The Easiest Database Ever

barakm · on Feb 19, 2015

From the blog:

> Because gun is not a database (NoDB), it is a persisted distributed cache.

This I believe.

> The fatal flaw with databases is that they assume some centralized authority. While this may be the case initially when you are small, it always ceases to be true when you become large enough that concurrency is unavoidable.

Partially true. Though that's not necessarily a "fatal flaw", and calling it such is troubling. Yes, concurrency is unavoidable when you become large enough but you also want your data to be, well, consistent and persistent, but then you go on...

> No amount of leader election and consensus algorithms can patch this without facing an unjustified amount of complexity. Gun resolves all this by biting the bullet - it solves the hard problems first, not last.

Where in the code, pray tell, is it solving these problems? The fact that you also claim to be an AP system (and conflate this with ACID) makes me strongly wonder what your notions on Consistency actually are.

"Just a cache" needs some consistency as well, I'll point out, but you may not care as much about stale reads.

> It gets data synchronization and conflict resolution right from the beginning, so it never has to rely on vulnerable leader election or consensus locking.

From what I'm starting to understand you're, at best, shuffling that off to S3 or "other storage engines" -- you've still got to pay the cost. You can't really claim to do linearizability without, well, actually doing linearizability.

So, maybe it's a cache, sure. And you seem to like to work on the developer API, nothing wrong there. But there's nothing new under the sun and I'm really skeptical that hard distributed database problems are solved in one large JS file.

marknadal · on Feb 19, 2015

Clarification: I said GUN is NOT ACID compliant in the "usual" understanding of the term, since GUN is AP. Most people assume ACID means CP.

ACID is very vague though, and I'd like to explore it more by writing tests to either confirm or deny whether GUN supports it or not (would you be interested in helping build those tests?). I also want to get some Jepsen-like tests up as well.

Data convergence (data sync) is guaranteed by the Hypothetical Amnesia Machine algorithm, which is completely deterministic and idempotent. There are some details on it in the wiki, let me know if you have any questions. I also did a tech talk on it.

In NO way does gun rely on S3 for consistency. That would be horrible. Check out the algorithm and slam me with questions/critiques. Thanks for looking. :)

barakm · on Feb 19, 2015

The problem with AP is that you completely lack consistency guarantees, and most people will agree that's a bad thing. That you advocate that notion so heavily, and without scholarly research makes me really, really nervous.

From what I'm seeing in the function you reference (with very few details in the wiki) there's no accounting for, well, `P` -- what happens when a message does not make it, which happens all the time.

What I see is sort of, if you squint, a vector sequence, but one that takes nothing about actual distributed systems (and their unreliability) into account. It also completely trusts no errors have happened in states it's receiving.

This all through some really messy code...

You also claim to not use clocks, but the first thing you do depends on a clock:

```
var serverState = Gun.time.is();
var incomingValue = Gun.is.soul(deltaValue) || deltaValue;
var currentValue = Gun.is.soul(current[field]) || current[field];
var state = HAM(serverState, ...
```

Where Gun.time.is() does in fact read the current time:

```
Util.time.is = function(t){ return t? t instanceof Date : (+new Date().getTime()) }
```

So one thing that will break you right away will be clock skew. That's a critique I get simply and for free. I also see nothing about guaranteeing that states applying to each other are actually what they expect to be. Which, you know, would depend on consistency....

I'm very, very skeptical.
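The clock-skew concern above can be made concrete with a minimal sketch of a timestamp-only last-write-wins register. This is not Gun's code: `makeReplica`, `write`, and `merge` are hypothetical names, and the five-second skew is an assumed value chosen for illustration.

```javascript
// Hypothetical timestamp-only last-write-wins register (not Gun's code).
function makeReplica(skewMs) {
  return {
    skewMs,                 // how far this machine's clock is off (assumed)
    value: undefined,
    stamp: -Infinity,
    now(realTime) { return realTime + this.skewMs; }
  };
}

// A local write is stamped with the replica's own, possibly skewed, clock.
function write(replica, value, realTime) {
  replica.value = value;
  replica.stamp = replica.now(realTime);
  return { value, stamp: replica.stamp };
}

// Merge keeps whichever update carries the larger timestamp.
function merge(replica, update) {
  if (update.stamp > replica.stamp) {
    replica.value = update.value;
    replica.stamp = update.stamp;
  }
}

const fast = makeReplica(5000);   // clock runs 5 s ahead
const exact = makeReplica(0);

const first = write(fast, "written first", 1000);     // real time: 1000 ms
const second = write(exact, "written second", 3000);  // real time: 3000 ms

// Exchange updates in both directions.
merge(fast, second);
merge(exact, first);

console.log(fast.value, "|", exact.value);
```

Both replicas end up agreeing, but on the earlier write: the later one is discarded because the skewed clock stamped the first write further into the future.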

marknadal · on Feb 19, 2015

The conflict resolution guarantees eventual consistency, but not strong consistency.

Why did I choose that route? Globally locking data is something the universe doesn't even do, we are bound by the laws of physics and can't get faster than the Speed of Light to transmit information. So you have to make a trade off.

Now, the cool thing is that you can use AP systems to create globally locking/strong consistency behavior on top. I do plan for there to be some plugins/modules that handle this for you, as long as you are aware you're moving into either a potentially very SLOW system, or a centralized one.

Also, CRDTs are something that are really important to look into. They're a good reason to try parting ways with CP systems, because they have some fantastic guarantees on data integrity.

We're working (as I previously mentioned) on getting academics involved and papers written. I don't have much funding yet, though, and it is a very expensive and long-winded process, but a very important one.

Next issue: LARGE DISCLAIMER: GUN is NOT relying on TIMESTAMPS alone to converge data. THE ACTUAL ALGORITHM is a state machine operating within a boundary function. The boundaries are defined by sort values, for which I use a combination of vector clocks and timestamps. It is not timestamps alone; timestamps have massive vulnerabilities.

CLOCK SKEW will NOT break data sync. I've demonstrated syncing happen across machines with bad drift. But you are very right: I need to get docs and videos and evidence of this up ASAP.

Pardon for the caps, I just want people to skim and know for sure that I am not ignorant of these things. I'm not trying to sound angry. I'm super happy people care about these things.

barakm · on Feb 20, 2015

So I tried it. I successfully lost data, and reported conflicting/inconsistent data, with a one-line change to your example to-do list. Instead of generating a key that won't conflict, I force it to write to the same key. Yes, this means it's one list item. That's fine, I just need to make this list item inconsistent between two machines.

And then I simulate a network partition (virtual machine network disconnect. pull the virtual plug. (EDIT: and repair it!))

This is enough to cause a seconds-long lag before the "okay" case comes through. And a local write during that lag? Completely lost. Two windows. Virtual machine. Game over.

This is because your data is overwriting itself, and I don't even care which version I got. I just needed it to not (locally) be the correct one, for any amount of time.

It effectively becomes "last write wins" and, well, there's nothing new under the sun.

Curiouser and curiouser, the "last write wins" happens in about five seconds. Which happens to be approx. the clock skew. (Called it.) There's some timestamp reliance for sure.

It's really easy to break. Caching layer, again, perhaps. Database? Nope.

From some people who actually know what they're talking about:

http://basho.com/clocks-are-bad-or-welcome-to-distributed-sy...

(I also got exactly the overwriting "always win" behavior you discussed in the other comment as being a flaw of vector clocks. From where I'm sitting, this smells like a pretty garden-variety vector clock.)

marknadal · on Feb 20, 2015

Thank you for trying it! You should also post this to the Issues on the repo. I'd like to get some more details though...

On Keys: keys are unique, meaning they can only point to one thing. When you "forced" it to write to the same key, you changed references, your previous data is still stored, but you'll only be able to find it by looking it up by its soul (its ID) or by scanning for it.

Were you experiencing something different than what I just mentioned above?

On Network Partition: The clients store their updates in LocalStorage before they send the peers the update or get an ACK. What caused those peers (tabs) to lose their data?
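The "store locally, then send, then wait for an ACK" flow described above is commonly called an outbox. A minimal sketch follows, under stated assumptions: the `Outbox` class and its methods are hypothetical and not Gun's API, and a `Map` stands in for the browser's `localStorage` so the example runs in Node.

```javascript
// Hypothetical "persist first, send second" outbox (not Gun's API).
// A Map stands in for localStorage so the sketch runs outside a browser.
class Outbox {
  constructor(storage = new Map()) {
    this.storage = storage;
    this.nextId = 1;
  }

  // Persist the update before any network send is attempted.
  queue(update) {
    const id = String(this.nextId++);
    this.storage.set(id, JSON.stringify(update));
    return id;
  }

  // Drop the stored copy only once a peer has acknowledged it.
  ack(id) {
    return this.storage.delete(id);
  }

  // Everything still unacknowledged, e.g. to resend after a reconnect.
  pending() {
    return [...this.storage.entries()].map(
      ([id, raw]) => ({ id, update: JSON.parse(raw) })
    );
  }
}

const outbox = new Outbox();
const a = outbox.queue({ key: "todo/1", value: "buy milk" });
const b = outbox.queue({ key: "todo/1", value: "buy bread" });

outbox.ack(a);                    // a peer confirmed the first update
const unsent = outbox.pending();  // the second one survives a disconnect
console.log(unsent);
```

Note that an outbox only keeps the update from being lost in transit. Whether that update later wins or loses against a conflicting write is decided separately, by the merge rule.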

For clock skew, please see this reply: https://news.ycombinator.com/item?id=9077969 . Most "last write wins" algorithms require a centralized server (a Single Source of Truth) or they diverge on separate machines. That doesn't happen with GUN, you have no SST and they converge (even if it "looks" like last write wins) on both machines in an eventually consistent manner.

I'd like to follow up with these things, either here, over email mark@gunDB.io, on the gitter chat, or on github.

ahelwer · on Feb 20, 2015

What is a boundary function?

marknadal · on Feb 20, 2015

This person (in the comments below, please upvote him), and my reply, best addresses the most important questions (including the boundary function):

https://news.ycombinator.com/item?id=9077969

qqueue · on Feb 19, 2015

Is the Hypothetical Amnesia Machine algorithm backed by any scholarly research, or can you at least cite some papers with similar techniques? Blog posts and javascript are nice, but I have a certain fondness for LaTeX-generated PDFs whenever data integrity is involved, e.g. HyperDex's pretty excellent papers:

http://hyperdex.org/papers/

marknadal · on Feb 19, 2015

No papers yet, but I've been working on building up connection with academics and hiring them. So hopefully expect to see something published, but the process takes a while.

Meanwhile I'm actively working on building towards a simulation system and an actual high-scale deployable battle testing environment. Think of these as taking the theory from scholastic research, and actually implementing them in practical settings that anybody can run.

I'd also like to get a TLA+ specification going. If you have any experience in this stuff, please please please contact me mark@gunDB.io because this is important to me.

roeme · on Feb 19, 2015

When I read your responses, I can't shake the feeling I'm talking to some snake oil salesman in a nice suit.¹

And to be honest, GUN's docs sound similar. Heavy on how to use, and how awesome everything is, but as soon as one tries to understand stuff, it's either WIP or “team up with me/us!”. *eyebrow rises*

And the claim of “building up connection with academics and hiring them” falls perfectly in line with this. Why the hell can't you describe what you did by yourself? If it's so awesome, why don't you just die to explain it to everyone who asks? Or, $DEITY forbid, should the “academics” lend some credibility to GUN, even if it's with just their title? What's this HAM about?

Maybe it's just me, but all this with a rather complex naming convention (souls...) and code...

eh, I'll go with “show us teh algoz”. Or describe it.

¹) To illustrate: « I'm actively working [...]» – I'd like to see you passively working.

marknadal · on Feb 19, 2015

Have you looked at the Wiki?

https://github.com/amark/gun/wiki/Conflict-Resolution-with-G...

https://github.com/amark/gun/wiki/How-to-Create-GUN

https://docs.google.com/presentation/d/1VIOJc0bdzUNs7yXMLKCc...

No snake oil. It is a state machine operating over a boundary function. However, words like that sound super jargony, which sounds vague, despite the fact that people spend their entire lives working on just these problem sets and their nuances.

I'm happy to discuss the workings, and I'd encourage you to try and use GUN and see if it can withstand your concurrency attacks. Challenge accepted?

Edit: This person (in the comments below, please upvote him), and my reply, best addresses the most important questions: https://news.ycombinator.com/item?id=9077969

sseveran · on Feb 20, 2015

It's up to you to prove your algorithm is correct, not everyone else to prove that it's wrong.

marknadal · on Feb 20, 2015

If people aren't interested in reading the materials I've provided that go over the concepts and algorithms, there is nothing I can do to "prove" anything to them, they just remain agnostic.

Did you read the link to the other comment and my reply? I'd really appreciate it if you could critique it.

zero_iq · on Feb 20, 2015

You're happy to discuss the workings? How about writing them down somewhere...? All your documentation, such as it is, describes things using terms that you never actually define or explain. Your code is just as bad. Worse in fact, because it introduces yet further terms that are not in the documentation.

Your Conflict-Resolution-with-Guns page simply says 'see gun.HAM' for the explanation. No indication where this can be found. It isn't in the source repository and it isn't in the wiki. A google search reveals nothing.

The 'algorithm' presented on How-to-Create-GUN is meaningless because you don't define any of the return values. I can see how it maps some input values to some output values, but nowhere do you say what any of those return values actually mean, what I should do with them, or why they are useful.

e.g. return {amnesiaQuarantine: true} ... what does this mean? What should be done with that return value? What is an amnesiaQuarantine?

e.g. return {quarantineState: true} ... what does this mean? How does it differ from amnesiaQuarantine: true? What is a quarantineState? How should I react to receiving this return value?

Your documentation says a lot, but doesn't actually define anything, and is ultimately meaningless. This is why people are giving you a hard time and asking so many questions.

Most people reading will not know: what amnesiaQuarantine is, what amnesiState is, what the Hypothetical Amnesia Machine thought experiment is, what a boundary function is (there are multiple definitions - what are you using?), what 'converge: true' means, what 'incoming: true' means, what 'state: true' means (given that you say other 'state' variables are times, how the hell does a boolean represent a time?), what 'you have not properly handled recursion through your data' means. What is a 'soul'? What happens to the data when particular values are stored? Where are things stored? What is the data flow? How are things shared? How does sync happen?

Imagine you don't know what any of your terminology means - like everybody reading your documentation. Treat each term like an undefined variable. Now try to understand your document. You can't. Those undefined terms are never 'set' anywhere. It doesn't make any sense. As soon as it gets close to actually explaining anything it just handwaves, or leaves you with undefined terminology.

You don't define what kind of persistence you implement or what consistency guarantees (worse: your explanations do not seem consistent). You don't define how your conflict resolution works (the 'explanation' given is tantamount to Star Trek technobabble). You don't define how data is transferred. Your slides are useless without any notes.

In your code you say that ACID is vague. It really isn't. Your explanation of how you meet ACID is extremely vague however, using what appear to be truisms and contradictions, and yet more undefined terms that seem to have little to do with anything mentioned in the documentation. Your code is poorly structured, and badly commented. It uses 'cool' sounding gun-related terminology ('shot', 'roulette', etc.) without defining what the hell those things mean. There is nothing in the code that actually seems to do anything with consistency.

Your HAM algorithm - the very crux of your system as stated in your documentation, remains unexplained, and WORSE.. has a TODO: comment noting that it might not work and needs further investigation. This comment also mentions rollbacks.... yet nowhere else in the code or documentation says anything about rollbacks, and it's not clear why rollbacks would even be needed according to the (vague) explanation of HAM.

Your further explanations in these comments STILL do not actually describe precisely what HAM is or how it works. If you cannot do this in a simple and elegant manner, then NOBODY will be able to use or trust your database system.

If you want anybody to take you seriously, you must write a simple and concise explanation of HAM, including definitions of all your terms.

Frankly, it is so vague, and so unclear how it works that I am starting to think this is the product of some kind of mental illness...

Sorry to be so harsh, but nobody seems to be getting through to you.

EDIT: I'm reminded of Einstein's quote: "If you can't explain it simply, you don't understand it well enough."

marknadal · on Feb 20, 2015

Terms, defined here: https://github.com/amark/gun/wiki/semantics

Explanation of the conflict resolution in simple terms, here: https://news.ycombinator.com/item?id=9077969 .

Return values, with comments explaining their purpose, here: https://github.com/amark/gun/wiki/How-to-Create-GUN (I know you referenced this already, but did you read the comments explaining each return value? Edit: upon further reading your comment, it looks like you did, you just didn't like them. Perhaps I should make them more concise)

Slides (no audio/video unfortunately) on what operations to apply given the HAM return values, here: https://docs.google.com/presentation/d/1VIOJc0bdzUNs7yXMLKCc...

Persistence, currently S3 or localhost-testing-only disk. Persistence is a plugin.

ACID: Please link me to your favorite explanation of ACID that is clear and concise. I'll try and base my reply off that. I haven't found any good ones. GUN is AP, not CP.

People are taking me seriously, enough that I have contributors and funding. Some people don't take me seriously, and I'm trying hard to open up to them and be honest.

Do I need better documentation? Yes. Do I have documentation? At least some, yes.

What else can I handle for you?

zero_iq · on Feb 20, 2015

You still have not described HAM except in the vaguest of terms. You have addressed very few of my questions.

What is the Hypothetical Amnesia Machine thought experiment? Where have you described this, or where can a description be found? What do the return values mean? The comments are very little help. What situations do they cover?

How does this relate to your algorithm? Please explain the algorithm in simple terms, with precise definitions.

Your slides provide NO USEFUL INFORMATION WHATSOEVER. If you cannot see that someone who doesn't already know what HAM is will be TOTALLY UNABLE to understand your slides, then you have a serious problem seeing things from another's point of view and should get somebody else to do your documentation for you.

Believe me, it's not from lack of trying on my part. I'm not stupid. I'm an experienced developer and familiar with the internal workings of many different database systems. It's my job and my hobby. I have maintained and contributed to several database systems. Your slides are intriguing but meaningless to me.

Your list of definitions ('Semantics') redefines many things that already have perfectly good definitions, and declares new terminology for concepts that already have perfectly good labels.

Many of the definitions are vague or even nonsensical/self-inconsistent.

For example: "'soul': is the practically unique, immutable identifier for a node".

OK, so it's an identifier for a Node. So it's a Node ID. Why don't you just call it that?

But what does 'practically unique' mean? Something is either unique, or it isn't. It might be unique in a particular context, e.g. only in one instance of the database, or application, or server, ... or... what?

And what's a 'node'? "A group of no, one, some, or all fields, as they change over time." Well, you've redefined a perfectly good piece of jargon with a new and vague description. Node seems like a really bad word for this. In what way is a set of fields anything like a 'node' in the general sense? How does a node capture things over time? Is it a list, a history, an event log....?

"A group of no, one, some, or all" is better known as a 'set'. This is universally-accepted mathematical terminology. Except you've redefined that too.

And if something is a set of fields.... hey, how about calling it a field set? You know, like everybody else does...? Oh, no, let's call it a node instead....

My favourite: "Sent: proof that a message was received, might contain data that needs no receipt." The more you study this sentence, the more nonsensical and ambiguous it becomes. For a start, why not call it 'Received'? Or even 'Receipt', because that's the common noun for an item showing proof of receipt. Except, that you might need to prove receipt of data that needs no receipt... It is a ridiculous definition.

I'm sorry, but I can't take you seriously.

Frankly, it sounds like you yourself don't understand the domain and concepts you are describing, and are handwaving to cover your lack of knowledge. The fact that you provide your own terminology for things that could quite easily be described in standard terms betrays a lack of theoretical background, and ignorance of the state-of-the-art.

I'd venture a guess that your being REALLY, REALLY bad at explaining things may be correlated with the fact that you're apparently really good at describing tiny things in the most grandiose and self-aggrandizing terms. This seems to be ubiquitous across all your github projects. Redefining things unnecessarily, solving things that already have simple solutions, describing toy apps as radical revolutionary game-changers. I suspect your inability to explain things stems from this narcissism/egocentrism.

karlgrz · on Feb 20, 2015

Abso-fucking-lutely.

karlgrz · on Feb 20, 2015

Yes, exactly this +100.

karlgrz · on Feb 19, 2015

It's not just you.

sseveran · on Feb 20, 2015

I am not quite sure where to begin. I read your wiki. The casual disregard for years of distributed computing research struck me as a bit scary. It would be fine if the website did not claim "Data integrity is now a breeze." but it does. It turns out data integrity in a distributed setting is actually quite a hard problem given the number of boundary cases that exist. For that reason experienced practitioners building distributed algorithms start with proofs, not try to come up with them later.

From reading your page on conflict resolution, which is quite light on details, it seems like you want to have a transaction log but unless the data is purely commutative that is impossible in an AP environment.

If you are going to reference a "Hypothetical Amnesia Machine" it would be helpful to at least define it. Looking at the code, you hand someone two versions of their data and ask them to merge it.

It's nice that you are enthusiastic but it might be nice to build some proofs (or use someone else's) before making claims that our distributed state problems are solved.

marknadal · on Feb 20, 2015

This person (in the comments below, please upvote him), and my reply, best addresses the most important questions: https://news.ycombinator.com/item?id=9077969

But to address your other comments:

Why can't a transaction log be commutative? You just have to make sure to be explicit about the ordering of events (like by using linked lists or progressively incremented hashes). This is the realm for CRDTs and stuff though, which GUN core doesn't touch.

Yes the HAM deals with merging any two logs/streams/history, any two snapshot/states, but also merging any log/stream/history with any snapshot/state.

This is important, because it allows you to merge more than just two, you just have to do it serially (in any order). That merge algorithm guarantees the deterministic resolution (see the comment link I posted above).
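The claim that merges can be applied serially in any order amounts to saying the merge is commutative, associative, and idempotent. Below is a minimal sketch that checks this for a single field. It is an illustration under stated assumptions, not Gun's HAM: the `merge` function, the `{ value, stamp }` shape, and the tie-break on the serialized value are all hypothetical.

```javascript
// Hypothetical merge for one field; each state is { value, stamp }.
// Higher stamp wins. Equal stamps fall back to comparing the serialized
// value, so every peer picks the same winner without coordination.
function merge(a, b) {
  if (a.stamp !== b.stamp) return a.stamp > b.stamp ? a : b;
  return JSON.stringify(a.value) >= JSON.stringify(b.value) ? a : b;
}

const x = { value: "alpha", stamp: 10 };
const y = { value: "beta", stamp: 12 };
const z = { value: "gamma", stamp: 12 };   // same stamp as y: a tie

// Fold the same three states together in every possible order.
const orders = [
  [x, y, z], [x, z, y], [y, x, z],
  [y, z, x], [z, x, y], [z, y, x]
];
const results = orders.map(order => order.reduce(merge));

const allAgree = results.every(r => r === results[0]);
const idempotent = merge(y, y) === y;
console.log(results[0], allAgree, idempotent);
```

Every ordering converges on the same state, which is what "deterministic" buys you. It also shows the cost: the losing values are simply discarded, so convergence alone says nothing about whether the surviving value is the one a user intended.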

sseveran · on Feb 20, 2015

It relies on the data being commutative. If the data is not commutative then we need, wait for it, drum roll please, some consistency, which by your own admission is not provided. The consistency is the hard part which is why there is an entire field of study on just this problem.

I fail to see how your merge algorithm is deterministic in case of failures.
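The commutativity point can be illustrated with a small sketch. The helper names (`apply`, `append`, `clear`, `add`) are hypothetical and chosen only for this example; nothing here is taken from Gun.

```javascript
// Apply a sequence of operations to a starting state, left to right.
const apply = (state, ops) => ops.reduce((acc, op) => op(acc), state);

// Non-commutative pair: appending to a list and clearing it.
const append = item => list => [...list, item];
const clear = _list => [];

const listA = apply([], [append("a"), clear]);   // append, then clear
const listB = apply([], [clear, append("a")]);   // clear, then append

// Commutative pair: adding elements to a set.
const add = item => set => new Set([...set, item]);

const setA = apply(new Set(), [add("a"), add("b")]);
const setB = apply(new Set(), [add("b"), add("a")]);

const sameList = JSON.stringify(listA) === JSON.stringify(listB);
const sameSet =
  JSON.stringify([...setA].sort()) === JSON.stringify([...setB].sort());
console.log(sameList, sameSet);
```

Two replicas that receive the list operations in different orders end up in different states, while the set additions converge regardless of order. Getting the first case to converge requires the replicas to agree on one order, which is the consistency problem being discussed.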

karlgrz · on Feb 20, 2015

You -cannot- have a distributed database without consistency. Prove us ALL wrong, please.

marknadal · on Feb 20, 2015

Maybe I wasn't clear... GUN does have Eventual consistency, but not strong/global consistency.

Aka GUN is AP and Eventually consistent. You manually at the application layer can decide to lock, sacrificing Availability, and get strong consistency.

The merge algorithm works in an Eventually consistent case, but obviously is too naive for global Consistency, you'd need some form of consensus.

Or does that not address your comment?

roeme · on Feb 19, 2015

Seconded, I plowed through the “wiki” (not really one) and the code for a bit, but gave up after a while.

Doesn't have to be LaTeX for me, as long as it's comprehensive documentation.

(When OriFS was introduced here, the papers really helped to grok).

marknadal · on Feb 19, 2015

Did you read https://github.com/amark/gun/wiki/How-to-Create-GUN ? It is only introductory though. The slides might be helpful.

I've demonstrated GUN before handling conflict resolution across machines with significant drift. I need to get a video of this and more docs out on it.

roeme · on Feb 19, 2015

I did, but as you said by yourself, it's short on info.

The conflict resolution is where the meat is, isn't it? And https://github.com/amark/gun/wiki/Conflict-Resolution-with-G... is incredibly hard to read and doesn't really answer questions.

marknadal · on Feb 19, 2015

Yes, that is where the meat is.

Do the slides from my tech talk help at all?

https://docs.google.com/presentation/d/1VIOJc0bdzUNs7yXMLKCc...

I'll be working on getting a recording of the tech talk up, more blogs/documentation on the algorithm specifically. And as others have mentioned, some actual academic papers (but that could be a while).

Anything specific I can address?

Edit: This person (in the comments below, please upvote him), and my reply, best addresses the most important questions: https://news.ycombinator.com/item?id=9077969

theseoafs · on Feb 19, 2015

You do not prove a system is ACID by writing and running test cases.

Does the team behind this have experience with distributed systems?

marknadal · on Feb 19, 2015

Yes, I've spent the last 4 years doing consulting work and research on them.

Warning: I have a very different approach though, more on the side of bittorrent and bitcoin, than what you are going to find in your traditional databases (CP, Master-Slave, Consensus/PAXOS/etc.).

If you are a distributed systems person also, I'd really like to talk. If you're armchair/backseat scoffing, then I would still love to talk and show you how the algorithms work.

theseoafs · on Feb 19, 2015

Sorry if you were offended by my question; I didn't intend any disrespect, but I hope you'll forgive me if I say that this HN submission is very confusing. The webpage claims that Gun is both the "easiest database ever" and "not a database" (interesting, then, that the website's URL is gundb). It claims that the problem with databases is that they assume there's a "centralized authority", which is sort of absurd; the overwhelming majority of businesses and organizations that use databases have at least some data that they need to absolutely 100% guarantee is safe and consistent, and distributed algorithms with leaders are the easiest way to capture that. Also, what is "vulnerable" about consensus algorithms? Does your distributed database really have no way to reach consensus?

Persistence is solved with "any S3 like service"? So what does that mean, using Gun is going to tie me to another unrelated Database as a Service that I'm going to have to pay Amazon for?

I'm sure this tool offers something interesting that other tools can't match, since you made it and put time into it, but the existing documentation isn't capturing that yet. Write up another blog post that describes the details, the use cases, the guarantees, etc.; i.e. the actual hard technical details, rather than the PR-speak, and I'll happily take a second look.

marknadal · on Feb 19, 2015

Yes, I got caught red-handed with my NoDB/gunDB marketing speak. Pretty embarrassing, but the point is that it is a distributed persisted cache, so you get the benefits of a DB without having to manage or maintain a DB.

You are right, most businesses that run that type of logic probably have the money to afford configuring master-slave based systems. They probably should not move over to GUN.

No, you do not have to use S3 for persistence, you can also use your disk. Persistence is a plugin in GUN, so you could build your own module that uses anything to store data.

However, there are lots of interesting advantages to distributed/decentralized master-master systems. And I'm trying to make those algorithms available to the common man.

Thanks for the encouragement, I'll be adding more docs and blogs and stuff.

Edit: This person (in the comments below, please upvote him), and my reply, best addresses the most important questions: https://news.ycombinator.com/item?id=9077969

fidotron · on Feb 19, 2015

Is this based on CRDTs? ( https://en.wikipedia.org/wiki/Conflict-free_replicated_data_... . . . or ideas similar to that).

marknadal · on Feb 19, 2015

Yes, it is very similar. I'll actually be building some CRDT plugins on top of GUN core. CRDTs usually deal with specific data types. Interested in helping?
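As an example of a CRDT tied to a specific data type, here is a minimal sketch of a grow-only counter (G-Counter), one of the standard state-based CRDTs. It is a generic illustration, not a Gun plugin; the function names are chosen for this example.

```javascript
// Grow-only counter (G-Counter), a standard state-based CRDT.
// Each replica increments only its own slot; merge takes the per-slot max.
function increment(counter, replicaId, amount = 1) {
  return { ...counter, [replicaId]: (counter[replicaId] || 0) + amount };
}

function merge(a, b) {
  const merged = { ...a };
  for (const [id, count] of Object.entries(b)) {
    merged[id] = Math.max(merged[id] || 0, count);
  }
  return merged;
}

function value(counter) {
  return Object.values(counter).reduce((sum, n) => sum + n, 0);
}

// Two replicas count independently while partitioned...
const alice = increment(increment({}, "alice"), "alice");  // alice's slot: 2
const bob = increment({}, "bob", 3);                       // bob's slot: 3

// ...then exchange state in either order, possibly more than once.
const viaAlice = merge(merge(alice, bob), bob);
const viaBob = merge(bob, alice);

console.log(value(viaAlice), value(viaBob));
```

Because the merge is a per-slot maximum, it is commutative, associative, and idempotent, so no increment is lost however the states are exchanged. That guarantee comes from the shape of the data type, which is why CRDTs are defined per type.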

stephanfroede · on Feb 19, 2015

What are "hard distributed database problems"?

I came to the conclusion that they do exist, but I have observed a need to manage millions of tiny write transactions per second-> IoT.

marknadal · on Feb 19, 2015

I've come to a similar conclusion. The smaller your updates are, the better; small updates allow for cleaner data sync. Check out my other reply.

dang · on Feb 20, 2015

A few disagreements in this thread have crossed over into being disrespectful. This is a gentle reminder that you can (and on Hacker News, please do) disagree without calling names.

https://news.ycombinator.com/newsguidelines.html

https://news.ycombinator.com/showhn.html

bitanarch · on Feb 19, 2015

A few questions.

1. How do you define the operating boundaries for your time stamps? What is too low and too high and why?

2. What are the expected use cases for your conflict resolution algorithm? The HAM function you proposed would just overwrite one string with another and so for things like collaborative document editing, user intention isn't preserved.

3. Where is the vector clock defined in your code? I can only see Gun.time.is() in a brief glance at your code... and it is just getting the UNIX timestamp in milliseconds.
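For readers unfamiliar with the term in question 3, a vector clock is usually a map from replica id to a counter, compared slot by slot. The sketch below is a generic textbook version with hypothetical function names (`tick`, `compare`); it is not taken from Gun's source.

```javascript
// Vector clock: a map from replica id to the number of events seen from it.
function tick(clock, replicaId) {
  return { ...clock, [replicaId]: (clock[replicaId] || 0) + 1 };
}

// Returns "before", "after", "equal", or "concurrent".
function compare(a, b) {
  let aAhead = false;
  let bAhead = false;
  for (const id of new Set([...Object.keys(a), ...Object.keys(b)])) {
    const x = a[id] || 0;
    const y = b[id] || 0;
    if (x > y) aAhead = true;
    if (y > x) bAhead = true;
  }
  if (aAhead && bAhead) return "concurrent";
  if (aAhead) return "after";
  if (bAhead) return "before";
  return "equal";
}

const base = tick({}, "A");    // { A: 1 }
const onA = tick(base, "A");   // { A: 2 }
const onB = tick(base, "B");   // { A: 1, B: 1 }

const ordered = compare(base, onA);   // base happened before onA
const forked = compare(onA, onB);     // neither saw the other's write
console.log(ordered, forked);
```

Unlike a single timestamp, the comparison can answer "concurrent", which tells the application that a real conflict exists instead of silently picking a winner.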

marknadal · on Feb 19, 2015

Wonderful questions! Actually some of the best in the entire thread I think.

1. See (3) but first read:

A) The upper boundary is defined by the current machine's local clock, which could have skew or drift.

B) The lower boundary is defined by the last known update on an individual record (down to the UUID+field).

2. The expected use case for this conflict resolution algorithm is basic field/value pairs (terms defined here: https://github.com/amark/gun/wiki/semantics, and here: https://github.com/amark/gun/wiki/JSON-Data-Format) within an object identified by a UUID (called a node, as in a node in a graph).

These values are what HAM works on, and they are considered the lowest-level atomic pieces. In order to sync on collaborative text you need to build an OT layer on top of this (I plan on doing this, possibly integrating with ShareJS as another mentioned). You cannot collaboratively sync on atomic values by themselves, you must define a CRDT for that - plugins/modules for them will be coming later.

3. Vector Clocks. HAM does not assume what the sort key is for state, it just assumes it is a value it can do <, <=, ===, >=, > comparisons on.

A) Vector clocks have a vulnerability that if you are working with temporary/ephemeral machines, the clocks will constantly get reset and have to play "catch up". However, network partitions are highly likely, so there is no guarantee that two machines won't issue a conflicting vector clock. If this happens, there is no standard way of dealing with this, although there are plenty of workarounds.

B) Timestamps also have a vulnerability, that is if you set your local clock ahead (say 2 years in the future) then it will "always win", wiping out other peers' valid values. However you unfortunately cannot determine in an untrusted network whether a peer is being malicious about being 2 years in the future, or if they are actually at a different point in timespace - like a GPS satellite or on Mars, or went offline in the subway.

C) As a result, this is why I combine them together via the boundary function. The upper and lower boundaries of the state machine provide the relative "vector" for the untrusted timestamp in the delta update.

The benefits of this technique are two fold:

1) You get deterministic and idempotent resolution within a special-relativity timeframe in a decentralized system without gossip (consensus).

2) If you do run GUN within your own trusted network, you can use the timestamps to calculate drift between machines and then readjust the boundary function of the state machine. Thus giving you a highly accurate "objective" view of your data across peers, which if the latency is low enough could indicate it is worth creating locks (but thus sacrificing Availability).
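
For readers who want something concrete, here is a minimal sketch of the boundary idea described in (1) and (3). It is an illustration only, not GUN's actual HAM code: the function name `resolve`, the result shapes, and the tie-break on JSON strings are all assumptions, and states are assumed to be millisecond timestamps.

```javascript
// Hypothetical boundary-based resolver; NOT taken from GUN's source.
// machineState:  this machine's clock right now (the upper boundary)
// incomingState: the state attached to the incoming delta
// currentState:  the last known update for this UUID+field (the lower boundary)
function resolve(machineState, incomingState, currentState, incomingValue, currentValue) {
  if (machineState < incomingState) {
    // Claims to come from our future: hold it until our clock catches up.
    return { action: "defer", wait: incomingState - machineState };
  }
  if (incomingState < currentState) {
    // Older than what we already have: only useful as history.
    return { action: "historical" };
  }
  if (currentState < incomingState) {
    // Inside both boundaries and newer: the incoming value wins.
    return { action: "accept", value: incomingValue };
  }
  // Equal states: break the tie deterministically so every peer picks the same value.
  // Comparing JSON strings is an assumption of this sketch.
  const a = JSON.stringify(incomingValue);
  const b = JSON.stringify(currentValue);
  if (a === b) return { action: "same" };
  return a > b ? { action: "accept", value: incomingValue } : { action: "keep" };
}
```

Because the decision depends only on the three states and the two values, every peer that sees the same inputs reaches the same result, and replaying the same delta changes nothing.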

Hope this was clear enough! Any questions? I'm going to be reposting this in the rest of the thread.

SlyShy · on Feb 19, 2015

I don't know if I should consider this "the easiest database ever" or "gun is not a database" (from the FAQ). Github says "a distributed, embedded, graph database engine".

I think some clarification around the marketing could do a world of good.

marknadal · on Feb 19, 2015

Good point, oh boy - caught me red handed. shameshame.

What I'm trying to get across is that it is the easiest database because it is not your traditional master-slave database, and it doesn't require maintaining any database process. It is indeed just a cache, but it has all the benefits of a database.

lberger · on Feb 19, 2015

I'm lost. How does it have all the benefits of a database, without any persistence? That doesn't sound like a database at all.

marknadal · on Feb 19, 2015

Persistence is just a plugin/module/hook. Currently it plugs into a very never-should-ever-be-deployed file on disk (for easy local testing only) and S3.

We're going to be adding more storage engines though! Hopefully building an open source S3 that uses fancy algorithms to store on disk and on peers. However I don't know that stuff, somebody else is doing it (or I'm hiring - we're funded!).

mdcox · on Feb 19, 2015

Agreed. I read through the whole site and never saw the word "graph" until I came back and read this comment which puts me a bit on edge. Seems like a good idea, but when the marketing seems to imply a purpose defined via shotgun approach, I start to get wary of a possible hype machine.

marknadal · on Feb 19, 2015

Other than a few contributors, it is basically just me. But I'm hiring! (We're funded).

So TBH, I kinda have to try doing the hypemachine thing to get the word out. :( Does that make me evil?

mdcox · on Feb 19, 2015

Hype and marketing can be great! It's empty hype with nothing to back it up that worries me. I don't think that's Gun, but the "throw buzz word" style marketing (especially if there are contradictions) instantly puts me on guard. If Gun solves a problem, then by all means get the word out about it any way you can!

nolanl · on Feb 19, 2015

Interesting project! It seems to share a lot of the goals and design choices of PouchDB/CouchDB: distributed, offline-first, eventually consistent, deterministic conflict resolution, etc.

One big difference I can see is that it's only using LocalStorage, which has good cross-browser support, but only allows 5-10MB maximum. Are there plans to add IndexedDB/WebSQL support so that users can store more data?

marknadal · on Feb 19, 2015

Yes! Thanks.

LocalStorage implementation is just the default plugin, and I chose it first because of its compatibility. I'd like to get IndexedDB support in there as well. Interested in helping?

_pfxa · on Feb 19, 2015

I wanted to read the text on the page, but the styling, with the shadow, or the gloss, or whatever it is, it is giving my astigmatic eyes pain, so I couldn't, I'm sorry.

sanderjd · on Feb 19, 2015

Ha, yeah, I thought my eyes were broken. Turns out it's using the `text-shadow` css property. Much more readable after I turned that off.

marknadal · on Feb 19, 2015

sorry about that. IDK why but it makes it easier on my eyes, maybe I should do a survey (I'm probably just weird) and then fix it.

adambard · on Feb 20, 2015

I definitely remember complaining about this exact thing a year ago :P. At least you toned down the shadow a bit.

marknadal · on Feb 20, 2015

awwwe you remember me! Happy face! For... being "that guy" that had blurry text. Shoot, sad face. Thanks for sticking around :).

adambard · on Feb 20, 2015

Heh, I remembered the project too, I was just reminded of the blurry text by the blurry text.

If you find high contrast hard on the eyes you could drop it a bit by just making the lettering a lighter grey in lieu of the drop shadow. Just don't overdo it or you'll get people complaining about that.

evilduck · on Feb 19, 2015

text-shadow: 0px 0px 7px #DDD;

That 3rd number is a 7px blur radius starting from a 0,0 point and #DDD is light grey, blurring between solid black and solid white. I think you're just weird, very few people consider blurry text preferable.

ArekDymalski · on Feb 19, 2015

It looks very promising. However I wonder who is the intended user of Gun:

1. "Full-stack" developers who just want to save time and/or benefit from the NoDB aspect?

2. Beginners and front-end developers who don't know anything about databases?

In case of group 1 your marketing seems to be insufficiently technical as many people here have already noted.

In case of group 2 (which I belong to) things look completely different. As a beginner whose learning efforts are constantly disheartened by tutorials and courses which end at the "locally hosted HelloWorld app" phase, I'd be more than happy seeing:

1. A step-by-step, layman-friendly tutorial on installing Gun on S3 and other platforms.

2. A very well commented example app demonstrating how to create typical functionalities.

With such an approach you will keep the "Dropbox of databases" promise which sounds very exciting. Actually I think that something like this should be an obligatory feature on Codecademy or any web development MOOC dedicated to beginners.

marknadal · on Feb 19, 2015

Great questions.

1. As of right now, focusing on beginner/front-end devs who just want an easy open source Firebase-like database. People building small experimental apps, since we haven't finished our battle-testing suite yet.

However, I'd also highly encourage full stack developers to get involved and try it out and give us feedback. For small projects it'll probably save you time, but the plugin/module ecosystem (aka features) isn't mature enough yet, so you'll be writing a lot of your own logic. Which, please do! We need them!

If you don't want to run GUN on localhost, I'll host a GUN server for you. :) You are right, I need to get better docs/tutorials and information out on this, so laymen don't get disheartened.

Is there anything I can do to help? Thanks for your comment!

ArekDymalski · on Feb 19, 2015

Thanks Mark, I'll keep an eye on the docs page then. I'll also keep the thumbs up :)

rawnlq · on Feb 19, 2015

Have you heard of sharejs http://sharejs.org/? It's made by an ex-Google-wave engineer and uses operational transforms for eventual consistency. It seems like you guys are solving similar problems.

I mention this because Dropbox has their own "Dropbox for Databases" called Datastore: https://www.dropbox.com/developers/datastore which is based on Operational Transforms: https://blogs.dropbox.com/developers/2013/07/how-the-datasto...

marknadal · on Feb 19, 2015

Actually yes! I'm one of the people who accidentally sparked a long discussion in the #1 issues thread: https://github.com/share/ShareJS/issues/1 that I've seen other people on HN link to.

GUN doesn't have OT-style text collaboration yet, so go with ShareJS if that is what you need now. I do plan on implementing it on top of GUN though, or trying to get ShareJS integrated with GUN. Joseph is a really great guy.

Yup, I've talked to Steve Marx at Dropbox Datastore at a hackathon before. He's a great guy as well. They're using algorithms that require some centralized conflict resolution though. Which is great, but I'm interested in the decentralized side.

theseoafs · on Feb 19, 2015

I dislike that the webpage actually has very little information about what the tool does, what use cases it is suitable for, what the architecture is like, etc.

Here's an important question the homepage doesn't answer: is it ACID?

marknadal · on Feb 19, 2015

Good point, I'll try and move the blog to another page and replace it with more details.

The fastest summary is that it is an Open Source Firebase.

Flat-out answer for ACID: honestly, not in the way you would traditionally think, as it favors AP of the CAP theorem.

However, ACID terminology is actually pretty vague (http://en.wikipedia.org/wiki/ACID). Here are my comments about ACID in the code:

			A - Atomic, if you set a full node, or nodes of nodes, if any value is in error then nothing will be set.
				If you want sets to be independent of each other, you need to set each piece of the data individually.

			C - Consistency, if you use any reserved symbols or similar, the operation will be rejected as it could lead to an invalid read and thus an invalid state.
			
			I - Isolation, the conflict resolution algorithm guarantees idempotent transactions, across every peer, regardless of any partition,
				including a peer acting by itself or one having been disconnected from the network.

			D - Durability, if the acknowledgement receipt is received, then the state at which the final persistence hook was called on is guaranteed to have been written.
				The live state at point of confirmation may or may not be different than when it was called.
				If this causes any application-level concern, it can compare against the live data by immediately reading it, or accessing the logs if enabled.

If you have any specific further questions I am happy to answer. It has support for vector-clock/timestamp "state" transactions.
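
To make the "A" and "C" comments above a little more concrete, here is a rough sketch of an all-or-nothing set that rejects reserved symbols. Everything in it is hypothetical: `setNode`, `isValidField`, the `RESERVED` list, and the Map-based store are stand-ins for illustration, not GUN's API.

```javascript
// Hypothetical all-or-nothing write; names and reserved symbols are assumptions.
const RESERVED = ["_", "#"]; // stand-in for whatever symbols the engine reserves

function isValidField(key, value) {
  if (RESERVED.some((sym) => key.startsWith(sym))) return false;
  // Only plain JSON-style atoms are accepted in this sketch.
  return value === null || ["string", "number", "boolean"].includes(typeof value);
}

function setNode(store, id, fields) {
  // Pass 1: validate every field before touching the store.
  for (const [key, value] of Object.entries(fields)) {
    if (!isValidField(key, value)) {
      return { ok: false, error: `invalid field: ${key}` };
    }
  }
  // Pass 2: apply. A rejection above returns early, so the store is never half-written.
  const node = store.get(id) || {};
  store.set(id, { ...node, ...fields });
  return { ok: true };
}
```

If you want fields to succeed or fail independently, call `setNode` once per field instead of once per node.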

atombender · on Feb 20, 2015

I don't think ACID terminology is vague at all, and it sounds like you're trying to fit a square peg into a round hole here, terminology-wise.

Atomicity (A) means that changes must be committed or not committed, "all or nothing". If you commit the change set (X, Y) then upon successful commit, both X and Y must be present; if either X or Y is missing, it's not atomic. Conversely, if the commit fails, no changes may have been made.

Consistency (C) means that data is always valid, according to whatever rules are imposed by the data model. For example, classical RDBMSes enforce referential integrity (aka foreign keys), "not null" constraints, unique primary keys, etc. Consistency is the guarantee that every update conforms to these rules; a transaction cannot be committed if it doesn't. Consistency has nothing to do with conflict resolution (although in a concurrent environment, you do need both).

Isolation (I) means that one transaction must create the illusion that it is isolated from all other transactions, as though all transactions were applied serially. Any concurrent commits during the transaction must not be visible to it. Most databases implement a less strict level of isolation by default that is often called "read committed"; the transaction can see any changes from parallel transactions that are committed during the transaction (which means that a query may return different results if run multiple times), but it will not see uncommitted changes from other transactions. Many databases do implement the "serializable" isolation level, and will fail if you try to execute two conflicting transactions at the same time.

Durability (D) means that transactions must remain permanently stored after they are committed. This is pretty much the vaguest rule, since there are too many variables in real life: it doesn't say anything about redo/undo logs, RAID caches, etc.

It should be added that ACID makes the most sense in situations where you combine multiple updates in a single transaction. ACID is of course useful for single-key, or single-object, updates, but it really comes into play when you have longer-running aggregate updates that need to perform both reads and writes across a bunch of different sets of data.

akerl_ · on Feb 19, 2015

I'm attempting to draw a connection between your comments on ACID and what ACID actually means, and there doesn't appear to be any parallel.

marknadal · on Feb 19, 2015

I understand what you mean.

Could you do me a favor and point me to your favorite description of ACID?

tomphoolery · on Feb 19, 2015

> 400 Bad Request

https://github.com/amark/gun

marknadal · on Feb 19, 2015

oh snap, the HN "DDOS" has peaked! I'll see what I can do to get things back online. Thanks for putting the github link in here in case others get the same issue.

karlgrz · on Feb 19, 2015

I would really love if this actually worked as promised. Way too much skepticism and not nearly enough proof. Kudos for actually putting this out there, though. It'd be great to prove everyone wrong, but I will not hold my breath.

Good luck!

marknadal · on Feb 19, 2015

You can try messing with it yourself by doing the following (if you already have node/npm/git installed and are familiar with the terminal):

   git clone http://github.com/amark/gun
   cd gun/examples && npm install
   node express.js 8080

Then open it in a couple of browser tabs on different devices, change their system clock, try refreshing data, crashing things. etc.

I'm also trying to figure out how to write simulated tests (like Jepsen) that will do all of this for you and give you the results of what failed/succeeded. Till then, let me know if you see anything break.

karlgrz · on Feb 19, 2015

I'm not going to spin this up on 1000 nodes to make sure it handles the kind of load needed to simulate actual production traffic (which is what you would need to actually figure out if this would hold up to some kind of large scale load that Riak or Cassandra would be able to handle). Maybe you should do that yourself and document it to prove how good your product is!

marknadal · on Feb 19, 2015

You don't need to spin up 1,000 nodes.

You can just spin up 1,000 tabs.

Since they all run the same algorithm!

Yes, I am working on more tests to prove myself wrong or right. Please bear with me as I/we make progress, because it is literally only a few contributors and me.

This is v0.1.0 for a reason, not v1. Lots ahead, but please play with it while we work on developing the test suite.

karlgrz · on Feb 19, 2015

I appreciate the suggestion and response. Understand it is v.0.1.0 but you should also understand that when you bring something like this out with next to no academic backing behind your theories and algorithms there is DEFINITELY going to be skepticism and doubt.

You are exactly right, though. It's early, and I'll give you the benefit of the doubt that you will achieve what you want.

Also know that there is a TON of research in these areas (which you clearly are aware of based on your comments in this thread) that basically refutes a lot of what you are claiming. I would love to see more clear documentation along with actual proofs showing how your algorithm is sound.

Until then, good luck, and I look forward to hearing about your success!

marknadal · on Feb 19, 2015

Thanks! :)

Quick question though: I'm claiming an AP system, not that I have all three. What research are you referring to that suggests you can't have idempotent/deterministic conflict resolution? CRDTs are out there in the wild and working. Do you have any papers in mind?

karlgrz · on Feb 19, 2015

I'm not saying you claimed to have all three.

Only paper I would have in mind is the CRDT paper from Letia, Preguiça, and Shapiro which I'm sure you're already familiar with.

The thing that bothers me the most is that it appears your entire algorithm (Hypothetical Amnesia Machine) has no proofs behind it. Specifically, your wiki article here:

https://github.com/amark/gun/wiki/Conflict-Resolution-with-G...

Has a giant hole where the substance would be. That bothers me because you are putting this potentially cool thing out there WAY BEFORE you have done the actual work.

Again, I applaud the fact that you actually put this together and you implemented it. And I understand it's v.0.1.0. That's fine.

Claiming this: "All conflict resolution happens locally in each peer using a deterministic algorithm. Such that eventual consistency is guaranteed across all writes within the mesh, with fault tolerant retries built in at each step. Data integrity is now a breeze."

without any proof that algorithm actually does this reliably and WITHOUT DATA LOSS bothers me. There is so much snake oil out there, you don't need to be starting off on the wrong foot.

I'm no expert at this stuff (I've only been working on distributed systems for about 5 years) but I'm also not claiming to be an expert. I just know that there is a lot of hand waving out there, and I think it would be important to actually prove your algorithm.

My 2¢.

marknadal · on Feb 20, 2015

This person (in the comments above/below, please upvote him), and my reply, best addresses the most important questions: https://news.ycombinator.com/item?id=9077969

Please don't assume I haven't done the "actual work", I have. The academic side of the equation with proofs is going to take much longer than the timeframe from my investors for this seed round. I openly admit that, but I'd rather do the good of getting this out into people's hands to actually play and build stuff with.

To be honest, I'll probably want to get Jepsen tests and the sort built before the academic side of the equation is completed. Thank you for being skeptical (I like that), but please don't ignore or not experiment with something just because a paper hasn't been published yet. Who knows, if you did play with it, you might like it enough to help write the paper - but maybe that is me being too optimistic.

Blessings.

karlgrz · on Feb 20, 2015

That is good information, thanks for that.

danbruc · on Feb 19, 2015

There is no way this can work. Merging data changes inherently cannot be automated in the general case. Deciding if a change from foo to bar should win over a change from foo to baz depends on the semantics of those strings. There are some cases, for example counters, with simple and clear semantics where you can build reusable and robust solutions. You can also handle the general case with simple policies like last write wins. But there is no way any algorithm will ever be able to figure out whether to choose bar or baz, not least because I could arbitrarily declare either of the two outcomes correct.
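
The counter case is the standard example where merging can be automated. A minimal sketch of a grow-only counter (the well-known G-Counter CRDT, not anything specific to Gun) follows; the function names and the plain-object representation are choices made for this illustration.

```javascript
// Grow-only counter (G-Counter): each peer only ever increments its own slot.
function increment(counter, peerId, amount = 1) {
  return { ...counter, [peerId]: (counter[peerId] || 0) + amount };
}

// Merge takes the per-peer maximum, so it is commutative, associative, and idempotent.
function merge(a, b) {
  const merged = { ...a };
  for (const [peerId, count] of Object.entries(b)) {
    merged[peerId] = Math.max(merged[peerId] || 0, count);
  }
  return merged;
}

// The counter's value is the sum of all per-peer slots.
function value(counter) {
  return Object.values(counter).reduce((sum, n) => sum + n, 0);
}
```

This works because "add" has clear semantics: no peer's increments are ever discarded. For an arbitrary string changing from foo to bar or baz there is no such merge, which is why a policy like last write wins has to pick one.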

marknadal · on Feb 19, 2015

Every change can be preserved in a history/append-only/log/stream. So you don't have to "lose" data from another "winning". However, the algorithm will by default select one, you can then code it at the app level for the user to select a new winner.
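
As a rough sketch of that idea (hypothetical names throughout, and last-write-wins by state is assumed as the default policy, not confirmed as GUN's behavior): keep every write in an append-only log, compute a default winner, and let the app promote a different entry later.

```javascript
// Hypothetical per-field history; the class and method names are illustrative only.
class FieldHistory {
  constructor() {
    this.log = []; // append-only: entries are never edited or removed
  }

  append(value, state, peer) {
    this.log.push({ value, state, peer });
  }

  // Default winner: highest state, ties broken by peer id so all peers agree.
  winner() {
    return this.log.reduce((best, entry) => {
      if (best === null) return entry;
      if (entry.state !== best.state) return entry.state > best.state ? entry : best;
      return entry.peer > best.peer ? entry : best;
    }, null);
  }

  // App-level override: re-append the chosen value with a newer state.
  promote(index, state, peer) {
    this.append(this.log[index].value, state, peer);
  }
}
```

The "losing" value is still in the log, so a user picking it later is just another write rather than a special recovery path.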

The general case here is very much UUID-based key/value pairs. Anything beyond that, you should be using CRDTs and OT-like algorithms, which I will be building on top of GUN.

However, in the meanwhile, I challenge you to try running the example folder from the GitHub ReadMe and seeing if you can break the automated sync and cause data divergence!

Edit: This person (in the comments below, please upvote him), and my reply, best addresses the most important questions: https://news.ycombinator.com/item?id=9077969

nathan7 · on Feb 20, 2015

Awesome to finally see Gun on here, Mark! What still worries me is the reliance on external storage services, although a good local storage service could be built for Gun. Other than that, I'm glad to finally see docs!

stephanfroede · on Feb 19, 2015

Cool approach. I had some fights with Neo4J and taming IO. I did fall back on a 2nd Level Cache, which is nothing else than a huge hash map/KV store in memory.

marknadal · on Feb 19, 2015

Thanks! Interested in joining and working on these types of problems? You seem to have some pretty good skills. Shoot me an email mark@gunDB.io

glittershark · on Feb 19, 2015

You guys seriously need to work on your SEO - Googling "gundb" has the page show up with the text "Your browser does not support frames...".

marknadal · on Feb 19, 2015

Oh my goodness #fail. Thank you for spotting this. I'll try to figure out how to fix it (probably by not being cheap with domain masking).

fiatjaf · on Feb 19, 2015

If it "is just a cache" and the data is distributed among every client, where is the data at each time before it is persisted to S3?

marknadal · on Feb 19, 2015

Great question, I'm going to C&P a reply I did previously:

1. In memory in the browser tab's process.

2. If available, in the browser's localstorage or fallback.

3. In the server process's memory.

4. If available, on disk in the server.

5. If in a multi-machine setup, any other connected server that is subscribed to that data set, being in memory (3) or in disk (4) if available.

6. If configured, in a machine log on S3.

7. Persisted to S3, which replicates and shards it for you internally.

8. If configured, in a revision file on S3.

9. If configured, in a multi-region S3 setup, redundantly in many places.

(2) is not cleared till an acknowledgment that (7) is confirmed. (1) is not cleared until an acknowledgement that (7) is confirmed or if the tab is exited. In the case of (7) it is no longer the delta/diff, but a snapshot of that current data set with that delta/diff's update. Retries from (1) ~ (5) will happen at various events, if the confirmations are not satisfied. If a conflict has already occurred by (3) the acknowledgement from (5) will include a notification that the value has already been updated, along with the standard delta/diff of that conflicting update being sent down. Meaning (5) does not guarantee that your delta/diff has "won", only that it has been saved or is already outdated.

Worst case condition is that (2, 4, 5, 6, 8, 9) are turned off, in which case your user's data is as volatile as them prematurely leaving the page (although I suppose you could use an onbeforeunload to warn them). However, this behavior is the current norm for most HTTP POST based forms and apps. Actually, pardon me, worst case condition is that everything is offline simultaneously; however, this is not really interesting because then users won't even be able to access your app in the first place.
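
A minimal sketch of the "not cleared till acknowledged" rule between layers (2) and (7) might look like the following. The `Outbox` class and its methods are hypothetical, and a Map stands in for localStorage so the sketch runs outside a browser.

```javascript
// Hypothetical outbox: a delta stays in local storage until persistence is acknowledged.
class Outbox {
  constructor(storage) {
    this.storage = storage; // Map-like stand-in for localStorage (assumption)
  }

  // Save locally first, so a failed or slow send loses nothing.
  put(id, delta) {
    this.storage.set(id, delta);
  }

  // Called only when the remote store confirms the write.
  ack(id) {
    this.storage.delete(id);
  }

  // Everything still waiting for confirmation.
  pending() {
    return [...this.storage.entries()];
  }

  // Re-send every unconfirmed delta, e.g. on reconnect or on a timer.
  retry(send) {
    for (const [id, delta] of this.pending()) send(id, delta);
  }
}
```

In a browser, a check like `outbox.pending().length > 0` is the kind of condition you could test in an onbeforeunload handler before warning the user.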

kainolophobia · on Feb 19, 2015

I've looked at your "Hypothetical Amnesia Machine algorithm" and have a few questions.

First though, I'd like you to read this: http://research.microsoft.com/en-us/um/people/lamport/pubs/t...

marknadal · on Feb 19, 2015

Yes, I've looked at this paper before - I should reread it though.

I've done a tech talk (not recorded though) on the pros/cons of vector-clocks and timestamps. I have some very specific insights which I should probably write a paper on. Or at least get the tech talk recorded or written down. There are some slides at the bottom of: https://github.com/amark/gun/wiki/How-to-Create-GUN .

What questions may I answer?

fiatjaf · on Feb 19, 2015

The demo is a little confusing because it loads two iframes and seems to be faking it, but yes it works and no, it is not faking it.

https://dl.dropboxusercontent.com/u/4374976/gun/web/tabs.htm...

marknadal · on Feb 19, 2015

Thank you for noticing this. :)

My original tutorial actually required the user to physically open up multiple tabs and have them be side-by-side. However it was a mess and people didn't like it. So I opted to fake it... while still depending upon the real tech underneath.

HOWEVER, it is just running on a freebie heroku box, so it is probably bound to crash/fall-over soon.

lux · on Feb 19, 2015

As a "self-hosted Firebase", I'd love to see something like their integrations with various JS frameworks, for ex:

https://www.firebase.com/docs/web/libraries/react/

marknadal · on Feb 19, 2015

YES! We're actively working on trying to get adapters built for React, Angular, Ember, Backbone, etc. but we're a super tiny team.

Would you be interested in contributing? You could really help make a big difference.

lux · on Feb 19, 2015

Awesome! I've starred the project on Github. I'm in startup mode and juggling way too many things these days, but maybe I can find a free evening :)

marknadal · on Feb 19, 2015

sweet, shoot me an email mark@gunDB.io to talk more.

marknadal · on Feb 19, 2015

Hey everyone! If you have any questions, I'll be here for the next several hours. Also check out the GitHub Wiki: https://github.com/amark/gun/wiki .

bhz · on Feb 19, 2015

Have you tried redis?

http://try.redis.io/

marknadal · on Feb 19, 2015

I love redis! My first proof of concept of GUN used redis as the persistence/storage layer. But I moved off of it since I wanted a fully embedded solution.

Data-wise the difference is that Redis doesn't support graphs. But you could easily build that on top of Redis, so you could argue GUN is just graph data on top of Redis (well, not anymore) with a conflict resolution algorithm baked in.

bhz · on Feb 20, 2015

I'll have to give Gun a go when I have the chance. Thank you for providing the contrast.

Yadi · on Feb 19, 2015

Hey Mark! Congrats, it looks awesome, good to see this here!

trithagoras · on Feb 20, 2015

...Is there a glow effect around all the text?

protomyth · on Feb 19, 2015

Congrats. What is the license? I must be missing where it is and the source code I checked doesn't have it.

marknadal · on Feb 19, 2015

Thanks!

Honestly, I might put this up to an open-source vote.

I personally lean towards the MIT and the ZLIB license, http://en.wikipedia.org/wiki/Zlib_License .

However I also know a lot of other databases are doing AGPL, I think for monetary reasons. Which :/ I might also want to consider.

But as I said, I honestly think this should be a community decision.

Could people reply back with what license they'd like?

wongarsu · on Feb 19, 2015

If you want everyone to use your database, MIT or ZLIB are clearly superior. For you(r company) that would limit your monetization options to support and similar, which is certainly not ideal.

If you value free software (as opposed to open source), AGPL is a good option and allows you to sell more permissive licenses to everyone who needs one.

If you actually want to make money with this, it's really a question of your business model. I would use it with either license.

lclarkmichalek · on Feb 19, 2015

Why would using AGPL imply not valuing open source?

jackbravo · on Feb 19, 2015

Because open source guys value having more people using your code, and using AGPL discourages some people from using it because they can't keep their modifications private?

samuelcouch · on Feb 19, 2015

This is really awesome! Excited to use it.

_wiv7 · on Feb 19, 2015

"With all new flavors like banana, fizzbitch, and GUN!" https://www.youtube.com/watch?v=t-3qncy5Qfk

sigmonsays · on Feb 20, 2015

This is the best troll ever

curiously · on Feb 19, 2015

this seems like a great tool. hosting firebase on my own is what I want to build real time apps, is this possible?

I have the same concerns for meteor which I have for this as well, which is security and scalability.

How does Gun address those two things?

marknadal · on Feb 19, 2015

Great questions.

Hosting on your own: Yes.

Security: Currently a "Roll Your Own" approach, where you wrap GUN behind some firewall/throttling-like system.

Why? Because permissions are such app-specific behavior, I haven't figured out how to generalize it. I don't think it is possible to do it, so in the future we'll probably provide various security plugins that come with app-specific assumptions.

Scalability: Run the example folder in the GitHub Readme, and open up hundreds of tabs. Gun is running individually in all of them. See how it handles that.

I'm trying to have a production-grade battle-testing suite developed soon, such that you could just run a script, it would ask you how much you want to spend on the test, and then it would deploy a ton of GUN peers to the cloud and generate a ton of traffic and load. This is not available yet, but something I'm focusing on within the next 6 months or year.

Anything I can help with?

lilyball · on Feb 19, 2015

Why is it called Gun? The name is a little off-putting. What's next, a database called Kill? How about Murder? Genocide?

Edit: The fact that I'm being downvoted for voicing a concern about the naming is really disappointing. This is a serious issue, and I would appreciate a response, not being buried.

tlrobinson · on Feb 19, 2015

The fact that you're being downvoted suggests most people strongly disagree that this is a "serious issue".

sp4rki · on Feb 19, 2015

Why is it off-putting for you? What do you have against an inanimate object?

Kill? Murder? Genocide? Guns don't kill, murder, or commit genocide. At least I've never seen a gun go on trial for any of those. People on the other hand...

Anyways, you're probably getting down-voted because people here dislike politics. You're trying to inject a political issue into a technical one.

lilyball · on Feb 20, 2015

You can't use a politically-charged term as a name and then claim that it's purely technical. I'm not trying to inject a political issue in here. And in fact I don't even care about the politics. But what I do care about is the fact that naming a product "Gun" is quite distasteful and, as a result, I will go out of my way to avoid using it.

sp4rki · on Feb 20, 2015

I never claimed the name is "technical". The product itself is though, and instead of discussing the product itself you're expressing your opinion on the distastefulness of the name as fact. You might say you're not trying to inject a political issue, but you sound just like that guy that said "darkmail" is racist against white people.

lilyball · on Feb 20, 2015

Your argument seems to be that everybody should just ignore the name and focus on the product. And that's bullshit. Names are important, they have meaning, and they cause reactions in people that see them. To claim otherwise is being willfully blind.

sp4rki · on Feb 20, 2015

No, my argument is that "gun" is not "rape", "terrorism", or any other socially unacceptable word that's linked to morality and the ethically correct. You know what's bullshit? You saying you have absolutely no political intentions regarding your opinion.

You obviously live in the US. You obviously feel strongly about gun control. Anything anywhere can be found to be offensive to someone. As such we all need to have a little restraint and realize that if we start policing people because they gave something they made a name you don't like you're contributing to the problem instead of the solution.

lilyball · on Feb 21, 2015

I did not say "rape", I did not say "terrorism", I did not say anything about morality or ethics or politics.

You're making some pretty big assumptions as to the motivation for my comment, and they're wrong. I do not appreciate having my concern trivialized or dismissed as "political" when it's nothing of the sort.

My issue with the name "Gun" is that it's violent imagery. It has nothing to do with gun control, or anything political. It's about the glorification of violence. Our culture is already oversaturated with violent imagery. Objecting to unnecessarily violent names is not the problem. Immediately dismissing anybody who raises a concern about naming is the problem. You are contributing to the problem, not me.

sp4rki · on Feb 23, 2015

You might not have said the words (and I never said you did, for the record) but you're giving the word "gun" an equal connotation as per your "kill" and "genocide" references. So, you say it's wrong. You say it's violent. That's your prerogative and you're entitled to it. You can dislike the name if you wish, but it's insulting to attack other people's choices of a product name when it doesn't violate any valid and widespread social moral constructs.

So now I'm the problem you say? Well, I could say the same exact thing about _you_. You're the type of person that will take the freedom to own firearms and the right to defend yourself from the people. Guns to me mean security and discipline. My life has been saved by myself or a person wielding a gun on multiple occasions (no deaths mind you), and I think that while the human race is littered with despicable human beings it's incredibly irresponsible for anyone to try to take away our right to defend ourselves.

I'm not dismissing your concerns, and I haven't down voted you either because even though we disagree wholeheartedly I find it's a valid discussion topic. I truly don't mind your criticism and can at some point respect it. What I don't respect is your super heightened moral compass and to a point denigrating a product and its creator because it doesn't fly with what you believe, either morally and/or politically.

lilyball · on Feb 23, 2015

> but it's insulting to attack other people's choices of a product name when it doesn't violate any valid and widespread social moral constructs

That's complete bullshit.

> You're the type of person that will take the freedom to own firearms and the right to defend yourself from the people

What on earth are you going on about? I have never once expressed an opinion on gun control, yet here you are attacking me for a position that you've entirely manufactured in your mind. It's entirely possible to support the right to own and use guns while still decrying the overabundance of violent imagery in our society.

> denigrating a product and its creator because it doesn't fly with what you believe

When did the mere questioning of the name of a product suddenly become equal to publicly denouncing a product and castigating its creator? You, and everybody else who's been responding, are acting as if it's some great crime to express a dislike for an unnecessarily violent name. I have to assume this overly-defensive behavior is actually a reflex to defend the term "gun" rather than anything about this particular product.

sp4rki · on Feb 23, 2015

> That's complete bullshit.

Why? Because you say so? You got down-voted because you equated the word "Gun" to "Murder" and "Kill". That is what I call bullshit, and fear-mongering, and just plain old FUD.

> What on earth are you going on about? ...

Acting as if using the term "Gun" is a negative because of its relation to violence does a disservice to any and all anti anti-gun movements. There is no way around it. You might not believe in gun control yourself (and that's irrelevant in any case), but this attitude is nonetheless part of the problem. People should be taught to understand and respect guns, not to fear them implicitly.

> When did the mere questioning of the name of a product ...

You didn't just express the dislike of the name, which by the way would have gone down better with me and most probably with everyone else, but compared it with totally over-the-top "names" that do have a definite negative connotation.

You say we're overly defensive, but that would be incorrect. Most people just down-voted you and moved along. I decided to go ahead and tell you why you got down-voted and explain why your comment goes down in a sour manner, after which you decided to defend your position at any cost. The only one with a defense reflex and overly defensive behavior is you.

lilyball · on Feb 23, 2015

What do you think you're accomplishing here? You are overly defensive. You're putting words in my mouth, attacking straw men, and just generally doing everything you can to try and protest the very simple claim that the word "gun" is unnecessary violent imagery in this context. I hope you aren't expecting to convince me that you're in the right with your behavior, especially when you aren't even addressing the core point I've repeated (that violent imagery is not appropriate in this context).

If you want to engage in bad rhetoric and repeatedly attack me over my comment, that's your choice. But to then claim I'm being overly defensive because I respond to your attacks, that's just nonsense. But you do make a good point, which is that I'm not required to respond to you. If you wish to have a reasoned discussion about names and violent imagery and what contexts it is and is not appropriate, I will be happy to talk. But if you continue to respond in the same vein as you have so far, then do not expect a response from me.

sp4rki · on March 2, 2015

The fact that you believe I'm attacking you tells me all I need to know. I'm not protesting a single thing. You are. I'm telling you why people - myself included - do not agree with you. There is no violent imagery besides the one you want there to be.

It's ok though. Don't respond, this is obviously not going anywhere and we'll just agree to disagree. Cheers.

marknadal · on Feb 19, 2015

I didn't downvote you, so please don't think I'm the one trying to bury you.

I'm calling it GUN because it is powerful and therefore a dangerous tool to wield. Because I'm going with a fully decentralized/distributed system, it has also generated some controversy with people.

Fact is, centralized/master-slave consensus based databases are incredibly popular right now. Things like Riak and Cassandra's CRDTs are not getting as much traction as they should - but probably because they can be difficult to set up. I'm trying to blow this all out of the water and make distributed database systems easy for developers.

So I'm admittedly going for an edgy name. I'm not wanting to kill anybody, just centralized software.

lilyball · on Feb 20, 2015

Thanks for the response. I didn't think you were the one trying to bury me, but I appreciate the fact that you care.

I'm glad to hear that you are aware of the fact that this is a loaded term and that you intentionally chose it because you wanted an edgy name. While I'm still not a fan of it, I feel much better about it knowing the reason behind the naming. And I think you need to put this info somewhere on the site and the GitHub project. I read gunned.io, and I skimmed the README of your GitHub, and nowhere did you even acknowledge that the name was edgy, much less indicate that this was an intentional choice. I would urge you to add a FAQ entry on gundb.io, add a wiki page to your GitHub repo, and put a line somewhere in the README (perhaps at the bottom) linking to that wiki page. Otherwise, you're going to end up with more people than just me thinking that you chose a potentially-offensive name as opposed to a deliberately edgy one.

Speaking of your GitHub repo, you should also add gundb.io as the webpage for the repo, and probably link to it in the README.

acjohnson55 · on Feb 23, 2015

For what it's worth, I second the concern. The name probably wouldn't prevent me from using it, if it's the right tool for the job, but I definitely find it distasteful. And it's rather incongruous with the very friendly personality you exhibit in dealing with some pretty pointed criticism in the threads here.

I can just imagine the dialog:

  Me: Hey let's check out Gun for this.

  Team: What's that?

  Me: A distributed cache data store.

  Team: Why's it called Gun? That seems kind of violent. What does that have to do with distributed cache?

  Me: Hell if I know.

Perhaps it would be more effective to choose a name that captures the decentralized aspect of the system?