Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

DB Api #2182

Closed
wants to merge 26 commits into from
Closed

DB Api #2182

wants to merge 26 commits into from

Conversation

bcardiff
Copy link
Member

This PR is an initial effort to add a common db api to crystal.

Documentation has been add here and it goes trough a usage/implementation story.

There is already a forked implementation for sqlite and mysql that (partially) implements mysql's binary protocol (no dependency!)

It currently lacks some important features, but I bet is worth sharing it now in order to discuss and get feedback.

Not implemented features:

  • Connection pool
  • Time data type
  • Direct access to IO to avoid memory allocation for blobs.
  • Transactions

Please, highlight if there is a feature you think is a MUST for an initial version of a db api.

Brian J. Cardiff added 24 commits February 18, 2016 18:20
dummy driver that parses the query
initial type support for result_sets
expose list of types to support by the drivers implementors.
deal with nilable types with `#read?(T.class) : T?` methods.
change `#has_next` to `#move_next`
add support for blobs (i.e. Slice(UInt8) )
close result_set and statements
main entry points for exec_non_query and exec_query closing them when ready
`#query`, `#exec`, `#scalar`, `#scalar?` as main query methods from `Database`
blocks overrides that ensure statements are closed.
extract QueryMethods modules and use it in Database and Connection
"add" finalize methods, but they cause GC warnings
close / do_close in result_set
avoid closing statements
remove named arguments
refactor positioned arguments query
add DB::ExecResult to have a concurrency safety exec result
remove connection string from abstract DB::Connection
hide methods from api
get/return from pool while result set is been used. still single connection pool
ensure exec, query can be executed with Enumerable(Any)
update db doc sample
class DummyConnection < DB::Connection
def initialize(db)
super(db)
@@connections ||= [] of DummyConnection
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why ||= instead of just =? Then you don't have to use not_nil! everywhere.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oh... class variable...

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We still can remove .not_nil!, if we will define self.connections as:

def self.connections
  @@connections ||= [] of DummyConnection
end

and use everywhere else self.connection instead of @@connections.

@luislavena
Copy link
Contributor

This is huge @bcardiff, excellent work!

I've some questions about the implementation/usage.

In the sample from DB, is mentioned to read the values of ResultSet using read(String) and read(Int32)

From what I understand, that expects to read each of the fields obtained in a linear order.

With that in mind, is not possible to obtain one particular field in different order, say value(index : Int32) or value(name : String), forcing you to map to local variables first prior display:

rs.each do
  name = rs.read(String)
  age = rs.read(Int32)
  puts "#{age} - #{name}"
end

# :nodoc:
def prepare(query) : Statement
stmt = @statements_cache.fetch(query, nil)
if stmt.is_a?(Nil)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

unless stmt?

@bcardiff
Copy link
Member Author

@technorama I see that having a reference abstraction across db's is something positive for any language.

Many languages have them, that is not a reason by itself, but I haven't seen complains about a standard interfaces for db's.

If, as with other parts of the std, they evolve to a shard explosion, it will be more than acceptable to split it. But, until then, a discussion about an accepted db api among the community is worth it IMO.

From the languages I have used in the past years, Ruby is one that didn't offer a std db api, hence activerecord came out with rails making bridges for different adapters. With a standard api some common issues can be addressed together (pool, api, etc) and it is a step forward in freedom to change db driver.

A similar argument might be stated that the language itself does not need to offer a reference api for files or http request. Maybe in the future the could be taken apart but, meanwhile, some shard have been made around the current crystal std, which otherwise might took longer to appear.

@technorama
Copy link
Contributor

@bcardiff I'm not against a common db api, only against putting it in crystal directly.

Making the db api part of crystal reduces flexibility for programs that needs to use a specific version of a database driver. Example:

somedb-crystal v2.0 requires libsomedb v2.0 and crystal-db v2.0.
somedb-crystal v1.0 requires libsomedb v1.0 and crystal-db v1.0.
What if my application requires libsomedb v1.0 for compatibility with an old somedb installation?
If the common db api is tied to the crystal installation I must downgrade my entire application to a lower crystal version and not use ANY shards that require a newer version of crystal. If the common db api is in it's own shard I can probably just include somedb-crystal v1.0 and use any compatible version of crystal including shards for that version of crystal.

Will database drivers such as postgres and mysql be distributed with crystal or will they be shards? If the drivers are shards they can list the common db library as a dependency with no additional work for the user.

HTTP and files may be used by any application. The common db api is only used by specific database drivers. If the drivers are shards there is no upside to it's inclusion in crystal, only downsides.

Making the db api part of crystal reduces flexibility when there are changes to the API (as there are bound to be). It would be better for the API to evolve independently of the crystal version.

@ysbaddaden
Copy link
Contributor

Some valid points raised here.

There is still one upside: having DB into the stdlib makes it the defacto standard to implement a DB driver, and avoids fragmentation. I'm all for including DB into the stdlib... eventually. Drawback is that the API will have to be backward compatible.

Maybe having a shard would help until the API stabilizes (ie. has at least 3 great drivers: Sqlite3, MySQL, PostgreSQL). We wouldn't require a specific Crystal build to try it for example, and would have a dedicated place to discuss and hack. This is the pattern I follow myself (I had a bcrypt repo that got merged, now I have http2).

Side note: I resisted merging Shards into the Crystal repository, because I prefer distinct projects (UNIX philosophy at work) and it could just be distributed along with Crystal.

@asterite
Copy link
Member

Another idea is to have it as a separate shard, but inside the crystal-lang organization. In that way it'll be recommended as the de facto standard to implement, but won't necessarily be inside the standard library.

@bcardiff
Copy link
Member Author

Yeap crystal-lang organization might be the right place. Meanwhile I will try to address the issues exposed by @will. I still need to narrow some things, play around with overload/restrictions to allow custom datatypes and other issues I reached (bcardiff@bedc24a)

I will need some days thought before having news. Back to work ⚡

@technorama
Copy link
Contributor

The API seems SQL specific. Shouldn't that be reflected in the module name? Prepared statements, transactions and string based queries aren't relevant to key value stores or many NOSQL databases.

@ysbaddaden
Copy link
Contributor

👍 for crystal-lang organisation! I'll move Shards there right away ❤️

I thought the same thing: is DB eventually going past SQL or are NoSQL databases too specific, using HTTP/JSON protocol for instance, that using something generic doesn't really make sense?

@bcardiff
Copy link
Member Author

I was mainly thinking on SQL DB, using names like RDB or SqlDB as the main entry point seems awful :-P. If someone sees something blocking in the api for non sql database feel free to point it out. Otherwise I will like to stick with DB as the name.

@technorama
Copy link
Contributor

String based queries, prepared statements, and placeholders are mostly specific to SQL databases. Nothing beyond .new and .close are universally applicable to all databases.

Whatever namespace the Db portion is put under is unlikely to be used in much code. The largest consumer would be the adapters followed by higher level API's (like ActiveRecord/Sequel/Datamapper clones). Db::Sql or whichever namespace is chosen is unlikely to be typed or seen often. Please make it descriptive.

@ysbaddaden
Copy link
Contributor

Maybe SQL::DB? We could then extend the SQL namespace with things like SQL::Query for example.

I'll be fine with just DB, though.

@asterite
Copy link
Member

I'm fine with just DB.

@waterlink
Copy link
Contributor

I think keeping the context down to SQL is a good idea, since SQL is a standard. Though it should follow open/closed principle, so that different drivers can register custom functionality.

Am 26.02.2016 um 1:18 PM schrieb Ary Borenszweig notifications@github.com:

I'm fine with just DB.


Reply to this email directly or view it on GitHub.

@technorama
Copy link
Contributor

DB is not descriptive. There are many types of databases. What namespace should key value databases use? Hierarchical databases? Others?

This module is unlikely to be used directly doesn't need to be short for programmer comfort. Very few people use mysql/postgres/etc drivers directly. The only subclasses will be drivers and the direct consumers are likely to be ORM's or other data mapping modules.

A terse name is not advantageous (extremely few direct users). A descriptive one is (may show up in error messages, be seen in the class browser, or when browsing modules).

@technorama
Copy link
Contributor

How can someone contribute to this project? Submitting patches to this PR will not give credit in githubs system.

@bcardiff
Copy link
Member Author

bcardiff commented Mar 9, 2016

I moved the commits to https://github.com/bcardiff/crystal-db in order to be able to iterate independently of the compiler.

I am still making some tradeoff to achieve supported db types extensibility. Probably loosing some runtime time checks around these, instead of compile time :-(.

We can keep this PR around for a little while to discuss some overall design / news. When a first usable version of that shard is usable it will be fine to start issues there. If everything looks right It will eventually be moved to the organization probably.

@ozra
Copy link
Contributor

ozra commented Mar 9, 2016

I also think "DB" is way to generic for an SQL-module. I use a lot of database architectures for different purposes. SQL == DB feels pretty outdated...

@asterite
Copy link
Member

asterite commented Mar 9, 2016

Just a note: there's nothing SQL-specific in this DB interface

@waterlink
Copy link
Contributor

@asterite You are correct about SQL-specific. It is not SQL-specific, since SQL is a Language. Rather, it is very Table-specific, since it works with Tables, Columns and Rows. You can see it in tests and code all over the place. Additionally, the query at the moment looks like can be only a String, it can't be some nice object, or a raw data stream.

So my suggestion is to reflect this in the final name: DB::Tabular or DB::Relational. Whereas this creates a possibility to create DB::GraphBased, DB::DocumentBased or something along these lines. For these databases, you can, of course, write a tabular adapter, but it will miss a lot of specific features of aforementioned database types.

@waterlink
Copy link
Contributor

IMHO, being more specific like that, will allow to create a better APIs for it.

@asterite
Copy link
Member

asterite commented Mar 9, 2016

Are there generic drivers in any language for non-sql databases?

@waterlink
Copy link
Contributor

@asterite Not that I know of. It might even not make sense, since SQL is a standard, that is fairly simple to narrow down to a nice public API with implementation provided by drivers. All other types of databases are usually pretty unique, so such abstract API might not make sense for them..

@asterite
Copy link
Member

asterite commented Mar 9, 2016

@waterlink Exactly! So if you are going to use Mongo you'll probably do Mongo.some_method. Or if it's Raven you'll do Raven.some_method. So we could make it to be DB::SQL if we wanted to, but there seems there won't be any other type inside DB, because there's no standard for databases other than SQL-oriented databases.

That said, I wouldn't mind using DB::SQL for this (in Java it's java.sql, in Go it's database.sql), specially because you'll do DB::SQL.open and never have to specify DB::SQL again inside that block of code.

@waterlink
Copy link
Contributor

@asterite It doesn't necessary need to be DB::SQL or SQLDB, or anything like that. As you said, it does not mention SQL Language anywhere, rather it mentions a lot, that it works with connections, queries, tables, columns and rows. It could as well be a database driver, that works with CSV file, easily.

That is my reasoning to come up with DB::Tabular or DB::Relational. Maybe something like TableDB.

If we end up calling it just a DB, somebody will end up implementing a Mongo (or whatever) driver for it, but the public APIs will lack half of the features, that MongoDB provides.

@bcardiff
Copy link
Member Author

bcardiff commented Mar 9, 2016

Ok, it won't be plain DB. I would like to avoid something SQL specific, up to now I was able.
DB::SQL, DB::Tabular or DB::Relational seems good candidate to be.

@waterlink
Copy link
Contributor

@bcardiff I have read the code on https://github.com/bcardiff/crystal-db and left some questions/comments.

Overall the API looks really good and solid. 👍

I'm kinda worried about the fact, that most of public APIs require implementor to subclass some of the DB::* abstract classes, but I suppose it is not possible to do it in any other way, since we don't have a construct like Interface or Protocol in Crystal, yet.

@bcardiff
Copy link
Member Author

bcardiff commented Mar 9, 2016

Thanks @waterlink , I've just reply them.

I'm kinda worried about the fact, that most of public APIs require implementor to subclass some of the DB::* abstract classes [..]

At least for now I feel comfortable with the classes design. Some of the have some state. Drivers will probably be either lib wrappers or protocol implementors over sockets. In either case having a base class does not seems to be a problem.

@ozra
Copy link
Contributor

ozra commented Mar 9, 2016

+1 on Tabular, since that's the common theme. The example of the CSV for instance, is not relational by definition. But Relational sounds more beautiful B-)

def self.next_column_type=(value)
@@next_column_type = value
end

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I might miss something, but this looks as if a class var is used to transport non-global state?

@asterite
Copy link
Member

asterite commented Jul 1, 2016

I'm closing this as there's a repo tracking this: https://github.com/crystal-lang/crystal-db

Eventually we can consider adding it to the standard library, though maybe it's better if kept as a separate shard: it can evolve without needing compiler releases, and it being in the crystal-lang organization has a bit of weight.

@asterite asterite closed this Jul 1, 2016
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

9 participants