DBeaver 2.0.7

DBeaver 2.0.7 has been released.

Changelist:

  • Apache Cassandra extension plugin
  • ResultSet cell properties
  • Inline integer value editor fixed
  • Inline binary editor added
  • MySQL stored procedures editor fixed
  • Oracle XML editor fixed
  • PostgreSQL big schema loading fixed (performance increased)
  • SQL generation improved (MERGE operator)
  • Date viewer performance increased
  • A few UI bugfixes

This is the last planned 2.0.x version. The next release will be 2.1.0, and it will contain many improvements to result set viewing and editing (column reordering, pseudocolumn support, etc.).
We are also working hard on support for NoSQL and Bigdata databases. It is a serious challenge, as we have to build a proper data model and UI for them.
Read more about NoSQL/Bigdata:

NoSQL/Bigdata problems

Currently DBeaver supports browsing Apache Cassandra data, but with quite a few limitations. Besides Cassandra, we plan to support HBase, MongoDB, Lucene and maybe some other engines.
Generally, I hope to receive feedback on the current Cassandra extension; maybe some new UI ideas will appear.
The main problems with browsing data in NoSQL/Bigdata databases are:

Poorly structured data

While relational databases have a very strict data structure (tables, columns, constraints and foreign keys), NoSQL engines don’t. In the simplest case a NoSQL “table” is just a hash map where both key and value are byte arrays with application-specific content. And there is no portable way to visualize this hash map, as there is no information about the actual data types.
Sometimes we can assume some structure (e.g. Cassandra has schema meta information), but the real data may be completely different. Each data “row” (or “document”) may have any number of “columns” (fields/attributes) of any type. And by “type” I mean not a real type like “string”, “integer” or “date”, but just a byte array which is the result of application-specific serialization. That’s really disgusting if we are talking about a universal approach.
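To make the byte array problem concrete, here is a minimal Java sketch. The class `RawValueViews` and its methods are hypothetical names made up for this illustration; they are not part of DBeaver. It only shows that the same raw value can be read in several equally plausible ways when the store gives us no type information.

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;

// Hypothetical helper (not DBeaver code): different readings of one raw value.
final class RawValueViews {
    private RawValueViews() {}

    // Read the bytes as a big-endian 32-bit integer; only valid for exactly 4 bytes.
    static int asInt(byte[] raw) {
        if (raw.length != Integer.BYTES) {
            throw new IllegalArgumentException("expected 4 bytes, got " + raw.length);
        }
        return ByteBuffer.wrap(raw).getInt();
    }

    // Read the bytes as UTF-8 text; malformed sequences become replacement characters.
    static String asUtf8(byte[] raw) {
        return new String(raw, StandardCharsets.UTF_8);
    }

    // Fallback that always works: a lowercase hex dump of the bytes.
    static String asHex(byte[] raw) {
        StringBuilder sb = new StringBuilder(raw.length * 2);
        for (byte b : raw) {
            sb.append(String.format("%02x", b & 0xff));
        }
        return sb.toString();
    }
}
```

For the four bytes `4a 61 76 61`, `asUtf8` gives the text "Java" while `asInt` gives the number 0x4a617661. Both are valid readings, and without knowing how the application serialized the value a generic tool cannot tell which one was meant.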
We are not trying to fit every kind of data into a standard grid; obviously that is not possible. Even with relational databases like Oracle we have to deal with object types, arrays, structures, references, etc., and these types can’t be visualized in a single grid cell. To solve this problem we added a new feature in DBeaver 2.0.5: the cell viewer panel. So the current result set UI is a kind of three-dimensional grid. It’s not a panacea, but it solves a lot of problems. In the future we are going to make an even better UI solution.

No universal API

For relational databases we have our excellent JDBC (it has disadvantages and lacks many features, but it is awesome that we have a universal API which is supported by the absolute majority of database vendors).
For NoSQL we have nothing. Each vendor has its own proprietary API, its own approaches and its own data model. That is not bad in itself, because each NoSQL solution was designed for a particular purpose, and in the end these solutions are very, very different. On the other hand, all NoSQL solutions have much in common. Most abstractions are compatible with each other (document ~ row ~ object ~ multi column ~ etc). So I believe that it is possible to invent a universal API for any kind of data source (as I understand it, the Google AppScale and Apache Hadoop Hive projects are trying to do something in this area), and a universal query language for them (like UnQL).
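As a rough illustration of the "document ~ row ~ object" idea, here is a small Java sketch. `DataRecord` and `MapRecord` are hypothetical names invented for this post; they do not exist in DBeaver, JDBC or any vendor API. The sketch only shows that a relational row and a schemaless document can sit behind the same tiny interface.

```java
import java.util.Collections;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Hypothetical common abstraction: a record is a set of named fields.
interface DataRecord {
    Set<String> fieldNames();
    Object value(String field);
}

// Simple implementation backed by an ordered map (a "document").
final class MapRecord implements DataRecord {
    private final Map<String, Object> fields;

    MapRecord(Map<String, Object> fields) {
        this.fields = new LinkedHashMap<>(fields); // defensive copy, keeps field order
    }

    @Override
    public Set<String> fieldNames() {
        return Collections.unmodifiableSet(fields.keySet());
    }

    @Override
    public Object value(String field) {
        return fields.get(field); // null when the record has no such field
    }

    // A relational row is the special case where every record has the same columns.
    static MapRecord fromRow(List<String> columns, List<?> values) {
        if (columns.size() != values.size()) {
            throw new IllegalArgumentException("columns and values differ in length");
        }
        Map<String, Object> row = new LinkedHashMap<>();
        for (int i = 0; i < columns.size(); i++) {
            row.put(columns.get(i), values.get(i));
        }
        return new MapRecord(row);
    }
}
```

A UI written against `DataRecord` would not care whether the record came from a table or from a document store. Of course a real universal API would also need metadata, nested values, queries and updates, which this sketch ignores.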
The DBeaver Cassandra extension uses the Cassandra CQL JDBC driver to deal with data and queries. It is enough for simple data browsing (although it requires a few tricks with the JDBC API), but in fact it is just a workaround. JDBC wasn’t designed for this.

A lot of data

Bigdata solutions were named Bigdata for a reason. We have to deal with a lot of data.
Of course, relational databases may store a lot of data too (billions of rows, petabytes of data, etc.), but in the UI we have to visualize only a few rows at a time, and each row can’t be too big (it may contain LOBs, but those are visualized separately). In NoSQL (Cassandra in particular) each row may contain millions of columns, and that is a normal situation in the Bigdata world. The general problem is that it is hard to identify an atomic piece of data which

  • can be visualized on the screen
  • is small enough to fit in memory
  • is big enough to be meaningful

Currently DBeaver fetches the entire row from Cassandra and shows it in the “3D grid”, but if your rows have too many columns it will fail with an OutOfMemory error. Hopefully we’ll find a way to deal with this problem in a future version.
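One possible direction, shown here only as a sketch, is to read a wide row in bounded pages instead of all at once. The `ColumnPager` class below is a hypothetical helper, not what DBeaver does today, and it assumes the underlying driver can hand out the columns of a row lazily through an `Iterator`, which may not be true for every driver.

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;

// Hypothetical helper (not DBeaver code): reads a lazy column source page by page.
final class ColumnPager<T> {
    private final Iterator<T> source;
    private final int pageSize;

    ColumnPager(Iterator<T> source, int pageSize) {
        if (pageSize <= 0) {
            throw new IllegalArgumentException("pageSize must be positive");
        }
        this.source = source;
        this.pageSize = pageSize;
    }

    // True while the source still has columns that were not returned yet.
    boolean hasNextPage() {
        return source.hasNext();
    }

    // Returns at most pageSize columns; the pager itself keeps no earlier pages.
    List<T> nextPage() {
        List<T> page = new ArrayList<>(pageSize);
        while (page.size() < pageSize && source.hasNext()) {
            page.add(source.next());
        }
        return page;
    }
}
```

With a page size of 4 and a row of 10 columns, this yields pages of 4, 4 and 2 columns. That is a piece of data that can be shown on screen and fits in memory, though whether one page is "big enough to be meaningful" still depends on the data.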

– Serge Rieder