Advanced query engine

Advanced NoSQL Query engine

Databases, obviously, are made to store data. But just as importantly, they need to fetch that data in response to queries, and they should do it fast.

Here we’ll tell you about all the tools RavenDB gives you to map your documents with indexes, analyze text and spatial data, project your data into new shapes, and more. We’ll also explain how RavenDB automatically learns from your users’ queries so it can respond to them more quickly.

As information grows exponentially and users demand more and more, query use cases are more complex than ever. Most databases can’t keep up, but we at Hibernating Rhinos are ahead of the curve with our advanced query engine. Our unique approach is flexible, scalable, and performant, and it will help your business continue to thrive in a data-driven world.

RavenDB’s querying pipeline features full-text search out of the box, supported by an embedded Lucene.NET indexing engine. The power of queries is enhanced by a JavaScript interpreter, which lets you run code to perform ad-hoc aggregations and projections. Our intuitive Raven Query Language will look very familiar to native SQL speakers.

Version 6.0

In version 6.0, RavenDB introduced Corax, a new indexing engine built from scratch and tailored for the non-batch processing of documents, which better suits real-life database usage. The same types of data structures that Lucene (our older indexing engine) computes and holds in memory are now precomputed and stored on disk. Consequently, Corax uses significantly less memory and eliminates the notoriously long execution times of cold queries. Both Lucene and Corax are available as indexing engines, and you can choose which one to use on a per-index basis. In our tests, Corax is faster for both indexing and queries, often exceeding Lucene performance by a factor of 10.

Intelligent Indexing

Google and Amazon have discovered that even delays of less than a second can significantly reduce user retention. This is why RavenDB indexes update in the background rather than locking your data: it’s better to serve a query with data that’s a few milliseconds out of date (or “stale”) than to keep your users waiting. With each query, you’re informed of the index’s exact status so you can make the best decision on how to use the results.

RavenDB learns from past queries to expand and optimize its index repertoire. When RavenDB receives queries that can’t be satisfied by any existing index, it doesn’t just make the same costly full table scan over and over. It automatically creates a new dynamic index which will satisfy all similar queries in the future, resulting in an immediate boost in speed with no work on your part.
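The idea of creating an index the first time a query needs one can be sketched in a few lines of plain Python. This is only a conceptual toy, not RavenDB's implementation or API: the `TinyStore` class, its `scans` counter, and the sample documents are all hypothetical names invented for the illustration.

```python
from collections import defaultdict


class TinyStore:
    """Toy document store that builds a field index the first time it is needed."""

    def __init__(self, docs):
        self.docs = docs      # doc id -> document (a dict)
        self.indexes = {}     # field name -> {value -> [doc ids]}
        self.scans = 0        # number of full scans performed, for demonstration

    def _build_index(self, field):
        self.scans += 1       # one full pass over every document
        index = defaultdict(list)
        for doc_id, doc in self.docs.items():
            if field in doc:
                index[doc[field]].append(doc_id)
        self.indexes[field] = index

    def query(self, field, value):
        if field not in self.indexes:   # no existing index can answer this query
            self._build_index(field)    # create one and keep it for next time
        return sorted(self.indexes[field].get(value, []))


store = TinyStore({
    "users/1": {"name": "Ada", "city": "London"},
    "users/2": {"name": "Linus", "city": "Helsinki"},
    "users/3": {"name": "Grace", "city": "London"},
})
first = store.query("city", "London")     # triggers index creation
second = store.query("city", "Helsinki")  # reuses the index built above
```

In this toy, the second query never touches the documents themselves: the scan happens once, and every later query on the same field is a dictionary lookup.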

Using your custom (or “static”) indexes, your server will efficiently answer almost any query your users can imagine. Indexes can cover multiple fields and collections, perform complex data manipulation, and are highly configurable to suit your needs. Learn more about how indexes work in RavenDB here.

Your Querying Toolkit

RavenDB queries have many capabilities that our competitors lack, like:

Full-text search

RavenDB comes equipped with several Lucene analyzers that slice text into searchable tokens. Each analyzer takes a different approach to reflect the subtleties of language, taking into account numerical and lexicographic order, case, word stemming, punctuation, and symbols, or even identifying email addresses. The Ngram analyzer breaks words into even smaller tokens of a customized number of characters. Lucene also allows you to add and create new analyzers, and to prioritize search results according to your business logic.
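To make the notion of tokens more concrete, here is a minimal Python sketch of two tokenizing strategies: a simple lowercase word splitter and an n-gram splitter. It is an illustration of the general technique only and does not reproduce the behavior of any specific Lucene analyzer; the function names `standard_tokens` and `ngram_tokens` are hypothetical.

```python
import re


def standard_tokens(text):
    """Lowercase the text and split it on anything that is not a letter or digit."""
    return re.findall(r"[a-z0-9]+", text.lower())


def ngram_tokens(text, n=3):
    """Break every word into overlapping pieces of n characters."""
    grams = []
    for word in standard_tokens(text):
        if len(word) <= n:
            grams.append(word)   # short words are kept whole
        else:
            grams.extend(word[i:i + n] for i in range(len(word) - n + 1))
    return grams


words = standard_tokens("Hibernating Rhinos!")
grams = ngram_tokens("Rhinos", n=3)
```

With n-gram tokens in an index, a search for a fragment such as "hin" can match "Rhinos" even though it is not a whole word.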

Facets

Aggregate one query’s results into a multitude of categories using facet queries.

You’ve probably seen facets while shopping online. Imagine you have a website for selling cars. You could display your catalogue divided by vehicle type, or by manufacturer. By gas mileage, or by safety features. Maybe you’d allow your user to define a price range – or why not several.

With a single facet query, you can organize data into all of these categories and numerical ranges at once, as well as list the number of results in each category. Your users can then mix and match facets until they find the perfect fit.
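As a rough picture of what a facet result contains, the following Python sketch counts a small car catalogue by manufacturer, vehicle type, and price range in one pass. It is a conceptual illustration, not RavenDB's facet API; the `facets` function, the sample cars, and the price range labels are assumptions made up for the example.

```python
from collections import Counter

cars = [
    {"make": "Toyota", "type": "sedan", "price": 18000},
    {"make": "Toyota", "type": "suv", "price": 32000},
    {"make": "Ford", "type": "suv", "price": 27000},
    {"make": "Ford", "type": "truck", "price": 41000},
]

# Hypothetical price ranges: (label, inclusive lower bound, exclusive upper bound)
price_ranges = [
    ("under 20k", 0, 20000),
    ("20k-40k", 20000, 40000),
    ("40k and up", 40000, float("inf")),
]


def facets(docs):
    """Count results per make, per type, and per price range in a single pass."""
    result = {"make": Counter(), "type": Counter(), "price": Counter()}
    for doc in docs:
        result["make"][doc["make"]] += 1
        result["type"][doc["type"]] += 1
        for label, low, high in price_ranges:
            if low <= doc["price"] < high:
                result["price"][label] += 1
    return result


summary = facets(cars)
```

Each counter in `summary` corresponds to one facet a shopper could click on, together with the number of matching results.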

Spatial queries

GPS technology has flooded us with endless geospatial data – but with spatial queries, you’ll never lose your bearings. You could find every cafe within a mile of a train station and sort them by closest first – but much more than that. Use one query to divide the world into custom WKT regions of any shape and size, then overlap them in any combination. Are you looking for a city? Or a pet cat with a microchip? Fine-tune the accuracy to your needs. The word “where” has never been so specific.
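The "cafes near a train station, closest first" example boils down to a distance filter plus a sort. Here is a minimal Python sketch using the haversine great-circle formula. It illustrates the concept only and says nothing about how RavenDB evaluates spatial queries; the coordinates, cafe names, and the `within_radius` helper are hypothetical.

```python
from math import asin, cos, radians, sin, sqrt

EARTH_RADIUS_KM = 6371.0


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance between two points given in degrees."""
    lat1, lon1, lat2, lon2 = map(radians, (lat1, lon1, lat2, lon2))
    a = sin((lat2 - lat1) / 2) ** 2 + cos(lat1) * cos(lat2) * sin((lon2 - lon1) / 2) ** 2
    return 2 * EARTH_RADIUS_KM * asin(sqrt(a))


def within_radius(places, lat, lon, radius_km):
    """Return the names of places inside the radius, closest first."""
    hits = [(haversine_km(lat, lon, p["lat"], p["lon"]), p["name"]) for p in places]
    return [name for dist, name in sorted(hits) if dist <= radius_km]


station = (51.5308, -0.1238)  # hypothetical station coordinates
cafes = [
    {"name": "Far Cafe", "lat": 51.5600, "lon": -0.1238},
    {"name": "Near Cafe", "lat": 51.5310, "lon": -0.1240},
    {"name": "Mid Cafe", "lat": 51.5380, "lon": -0.1238},
]
nearby = within_radius(cafes, *station, radius_km=1.6)  # roughly one mile
```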

Server-side projections

Project your data into new shapes using RQL and the full extent of JavaScript code. At their most basic, projections can improve efficiency by taking massive documents and narrowing them down to only the data you want. But they can also get you more information out of the same data.

Separate words can become a sentence. Numbers can be plugged into calculations. Sequences can be sorted in a new order. Flat lists can be projected into hierarchies or networks. Do it all on the server side, and let your application relax.

Having one GB of data doesn’t mean you only have one GB of information. And by calling a few JavaScript methods, or defining your own, your query can do more than just fetch.
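To show what "projecting into a new shape" can mean, here is a small Python sketch that turns a large order document into a compact result. In RavenDB this kind of logic would be expressed in RQL and JavaScript on the server; the Python below is only a stand-in for the idea, and the `project` function and the order fields are hypothetical.

```python
def project(order):
    """Reshape one order document into a smaller, more useful result."""
    lines = order["lines"]
    return {
        # Separate words become a single string.
        "customer": f'{order["first_name"]} {order["last_name"]}',
        # Numbers are plugged into a calculation.
        "total": sum(line["price"] * line["quantity"] for line in lines),
        # A sequence is returned in a new order.
        "products": sorted(line["product"] for line in lines),
    }


order = {
    "first_name": "Ada",
    "last_name": "Lovelace",
    "shipping_notes": "Leave at the front desk",  # dropped by the projection
    "lines": [
        {"product": "Tea", "price": 4, "quantity": 3},
        {"product": "Cake", "price": 6, "quantity": 2},
    ],
}
result = project(order)
```

The client receives three small fields instead of the whole document, and the total has already been computed for it.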

Flexible sorting

When you have to load a long stream of documents, it helps to know which ones you want first. Unlike in other databases, the same index can let you sort data in one order just as easily as another order.

Results can be sorted ascending or descending, alphabetically, numerically, chronologically, and by Lucene index scores. Why not sort in two dimensions? If you have spatial data, you can order by distance. Need more power? Add or create new sorting logic. Like to be surprised? Sort randomly.

But the real power of this feature comes from combining multiple sorting orders, then sorting your sorting orders until they perfectly reflect your priorities.
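Combining sort orders means that ties in the first order are broken by the second, ties in the second by the third, and so on. A minimal Python sketch of that idea follows; it is not RavenDB syntax, and the product data is invented for the example.

```python
products = [
    {"name": "Lamp", "rating": 4, "price": 30},
    {"name": "Desk", "rating": 5, "price": 120},
    {"name": "Chair", "rating": 5, "price": 80},
    {"name": "Bench", "rating": 4, "price": 45},
]

# Highest rating first, then cheapest first, then alphabetical by name.
ordered = sorted(products, key=lambda p: (-p["rating"], p["price"], p["name"]))
names = [p["name"] for p in ordered]
```

Reordering the elements of the key tuple is all it takes to change which priority wins.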

Includes

With includes, your results can be amplified by fetching additional related data – without making additional queries.

Sometimes your query depends on making another query. Like when you want to know someone’s phone number… but you need to check an email to remember their name first. This would normally take two round trips to the server – often the most time-consuming and expensive part of a query. So why not tell the server to fetch the email and also look up that phone number so they can both be sent at once? Two queries, at the cost of one.

By modeling your documents so they ‘know’ about each other, a simple query can fetch a lot more data. When you want to query that other data, it’s already here.
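The email and phone number scenario can be sketched with a toy server that counts round trips. This Python example is a conceptual illustration, not the RavenDB client API: `ToyServer`, its `load` method, the `include` parameter, and the document ids are all hypothetical.

```python
class ToyServer:
    """Toy server that can return a related document in the same response."""

    def __init__(self, docs):
        self.docs = docs
        self.round_trips = 0

    def load(self, doc_id, include=None):
        """Return the document and, optionally, the one its `include` field points to."""
        self.round_trips += 1
        found = {doc_id: self.docs[doc_id]}
        if include is not None:
            related_id = self.docs[doc_id][include]   # the email "knows" its sender
            found[related_id] = self.docs[related_id]
        return found


server = ToyServer({
    "emails/1": {"subject": "Lunch?", "sender": "people/7"},
    "people/7": {"name": "Dana", "phone": "555-0100"},
})

response = server.load("emails/1", include="sender")
sender_id = response["emails/1"]["sender"]
phone = response[sender_id]["phone"]   # already here, no second request needed
```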

Map-reduce indexes

Aggregations use many documents to answer questions that no single document could, such as computing sums, averages, or maximum values.

Our queries are great at aggregating data, but our Map-Reduce indexes can do you one better. For your heaviest and most complex aggregations, define a Map-Reduce pipeline on the server side so your results are ready long before you need them. Whenever data is added or modified, the index will keep the aggregation up to date. With a little preparation, your users will be able to sum a million documents as easily as they fetch one.
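The benefit of keeping an aggregation up to date as data changes, instead of recomputing it at query time, can be shown with a short Python sketch. It is a simplified model of incremental aggregation under assumed names (`SalesByCountry`, `put`, `query`), not how RavenDB defines or executes Map-Reduce indexes.

```python
from collections import defaultdict


class SalesByCountry:
    """Keeps a per-country running total that is updated as documents change."""

    def __init__(self):
        self.totals = defaultdict(int)
        self.mapped = {}   # doc id -> (country, amount) produced by the last map step

    def put(self, doc_id, doc):
        # If the document was seen before, undo its previous contribution.
        if doc_id in self.mapped:
            country, amount = self.mapped[doc_id]
            self.totals[country] -= amount
        # Map: extract the grouping key and value from the document.
        country, amount = doc["country"], doc["amount"]
        self.mapped[doc_id] = (country, amount)
        # Reduce: fold the value into the running total for its key.
        self.totals[country] += amount

    def query(self, country):
        return self.totals.get(country, 0)   # a lookup, no documents are scanned


index = SalesByCountry()
index.put("orders/1", {"country": "FR", "amount": 10})
index.put("orders/2", {"country": "FR", "amount": 5})
index.put("orders/3", {"country": "US", "amount": 7})
index.put("orders/1", {"country": "FR", "amount": 20})   # the order was modified
```

However many orders are stored, answering "total sales for FR" costs one dictionary lookup because the work was done when each document was written.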