Data QnA: An exploration of a conversational query interface through the work of Stanley Kubrick

Data QnA is a recent feature from Google that builds on underlying research from Analyza – a system designed to parse, interpret and converse about natural language queries over data.

You may have seen this in action without realising it – most likely in either Google Analytics or Google Sheets. The examples below illustrate how you can enter a free-text natural language question and retrieve an answer (along with some suggested follow-up questions).

Although it’s not obvious at the outset, both of these features in Google Analytics and Google Sheets are powered by Analyza, and represent early efforts to bring a natural language query interface to highly structured data.

QnA extends this idea of democratising data by lowering the barrier to entry, bringing a similar user interface to BigQuery data via the ‘Ask a Question’ functionality.

The art of converting a question from natural language into an answer is a fundamentally hard problem. Systems that understand a question and provide a clear, concise answer have come a long way since expert systems first gained mass popularity in the 1980s.

Much of the research up to this point has focused on question answering (QA) systems that are far less conversational, where a question would often be translated into SPARQL – a query language for RDF data – rather than the (now) far more common SQL interface to data.

In this post we’re going to explore just what’s possible using an example dataset – the works of director Stanley Kubrick, which include arguably the most famous conversational interface of all time – HAL 9000.

To start with we’ll need to load our dataset into a BigQuery table (a view will also work here) and set up the table for indexing in QnA.

In order for QnA to comprehend this data there’s a small amount of upfront work to add some context, as well as some background work that QnA does to index the data into an appropriate format.

The manual work includes defining what rows represent (e.g., events, customers, movies) for each data source as well as any synonyms and display names for columns that may be in a table.

This is an important step as it ensures that questions can be asked using synonyms for each field rather than relying upon an exact match. The interface itself will perform some basic transformations for you (e.g., tokenizing customer_id into customer id), but beyond that you’ll need to provide your own mappings.
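As a rough illustration, the basic transformation above amounts to something like the following – the synonym mappings shown are hypothetical examples of what you might supply yourself, not values QnA provides:

```python
import re

def display_name(column):
    """Derive a human-readable display name from a column name,
    e.g. customer_id -> "customer id" (the kind of basic
    transformation QnA performs automatically)."""
    # insert spaces at camelCase boundaries, then replace underscores
    spaced = re.sub(r"(?<=[a-z0-9])(?=[A-Z])", " ", column)
    return spaced.replace("_", " ").lower().strip()

# Hypothetical manual synonym mappings you would define during setup
SYNONYMS = {
    "film": ["movie", "picture", "title"],
    "release_year": ["year", "released"],
}

print(display_name("customer_id"))  # customer id
```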

After this is complete, indexing of the data will kick off – this typically takes only a few minutes but may vary depending on the size of your dataset as well as the cardinality of certain columns.

Once our data is indexed we can immediately begin asking questions – either through the BigQuery console directly, or alternatively through a provided API which I’ll cover in a separate blog post.

The process of asking a question largely involves a few stages from the perspective of a user.

  1. Selecting a view or table to issue the query on – descriptive table names are important here. Consider putting any QnA ready tables into their own dataset.
  2. Suggestion of a question. Questions can be auto-suggested – though these aren’t typically things you would ask, so much as a demonstration of what is possible. As you input more context, suggestions will be returned for example questions on that field.
  3. Interpretation of a question (restating or asking for clarification / disambiguation). If a question has some ambiguity then QnA will provide a list of interpretations that you can choose from – ranked from the most to least relevant.
  4. Compilation of the interpretation to an equivalent SQL statement
  5. (Optional) Execution of that statement against a target (in this case a BigQuery table or view)
  6. (Optional) Provide feedback (thumbs up, thumbs down) along with a comment on the interpretation. This feedback isn’t currently fed back into the suggestion model, but it is useful for analyzing and debugging queries that may not be performing as expected.
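Taken together, the core steps form a simple ask-interpret-compile-execute loop. The sketch below is a hypothetical illustration of that flow – the function names are stand-ins I’ve invented, not the real Data QnA API:

```python
def answer(question, interpret, compile_sql, execute):
    """Walk a question through the interpret -> compile -> execute
    stages. The three callables are injected stand-ins for the QnA
    interpretation, compilation and BigQuery execution steps."""
    interpretations = interpret(question)  # ranked, most relevant first
    chosen = interpretations[0]            # or ask the user to disambiguate
    sql = compile_sql(chosen)
    return execute(sql)
```

Disambiguation (step 3) sits between `interpret` and `compile_sql`: when several interpretations come back, the user picks one rather than the system guessing.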

Suggestions

A question that begins with ‘How many’ will lend itself towards interpretations that count distinct values in a certain column. Suggested queries are ranked on their salience and relevance. Generally these suggestions make sense, though depending on the column names they can occasionally be awkwardly phrased while still remaining interpretable.

Interpretations

A critical aspect of QnA is that it restates the question after translating it from natural language into an intermediate representation. This statement is typically declarative and provides enough context to determine whether the interpretation is consistent with what the user is asking.

Linguistically, our question below – How many movies do I have? – is translated from a request into an intent: a unique count of film.
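As a loose sketch of what that compilation step might produce, the snippet below turns a single hypothetical intent into BigQuery SQL – the intent shape and the kubrick.movies table name are illustrative assumptions, not QnA’s internal representation:

```python
def compile_intent(intent, table):
    """Compile a tiny subset of intents into BigQuery SQL.

    Handles only the 'unique count' intent from the example above;
    the real QnA compiler covers far more than this illustration."""
    if intent["operation"] == "unique_count":
        return f"SELECT COUNT(DISTINCT {intent['column']}) FROM `{table}`"
    raise ValueError(f"unsupported intent: {intent['operation']}")

sql = compile_intent({"operation": "unique_count", "column": "film"},
                     "kubrick.movies")
print(sql)  # SELECT COUNT(DISTINCT film) FROM `kubrick.movies`
```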

In the event that our question poses some ambiguity QnA falls back to enumerating a list of possible interpretations. It is deliberately conservative here – aiming to ensure the question being asked is correct rather than performing a probabilistic match on the most likely interpretation. This is a sensible design decision – if users are relying on a correct answer it’s critical that the understanding of the question is correct.

Compilation

Once the intent of a query has been established it can then be compiled from an intent to a query.

For example, asking the question:

> What films have been nominated for best director?

yields an equivalent SQL statement.
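A plausible compilation – assuming hypothetical film and award_category columns on a kubrick.movies table, rather than QnA’s actual output – might be built along these lines:

```python
def compile_filtered_select(column, table, filter_column, filter_value):
    """Sketch of compiling a 'which rows match X?' question to SQL.
    All table and column names here are illustrative assumptions."""
    return (
        f"SELECT DISTINCT {column} FROM `{table}` "
        f"WHERE {filter_column} = '{filter_value}'"
    )

print(compile_filtered_select("film", "kubrick.movies",
                              "award_category", "Best Director"))
```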

Execution

QnA is capable of executing the question for you – though in the user interface this is done by either copying the query or opening it in the BigQuery editor. The API provides a more direct mechanism that can execute the question and return a BigQuery job id, which can then be polled asynchronously using the BigQuery jobs.get API. BigQuery is pretty fast – so you can generally expect an answer within a few seconds.
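That asynchronous polling pattern can be sketched generically – here get_state is an injected callable standing in for a jobs.get lookup on the returned job id, so the sketch runs without any real API client:

```python
import time

def wait_for_job(get_state, interval=1.0, timeout=60.0):
    """Poll a job-state callable until it reports DONE.

    get_state stands in for a BigQuery jobs.get call returning the
    job's current state ('PENDING', 'RUNNING' or 'DONE')."""
    deadline = time.time() + timeout
    while time.time() < deadline:
        if get_state() == "DONE":
            return "DONE"
        time.sleep(interval)
    raise TimeoutError("job did not complete within the timeout")
```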

Tips for using QnA

  1. Make sure your column names are sensible and understandable by humans
  2. Set up clear synonyms during the indexing stage
  3. Use cleaned, verified data models where possible rather than the raw source of data
  4. Consider localising any timestamps rather than using UTC if applicable
  5. QnA may not get a perfect interpretation every time – I’ve found that asking questions that follow the co-operative principle lends itself to better, more consistent interpretations
  6. Columns that display high cardinality (such as unique ids) are not likely to be indexed
  7. If applicable ensure you define a column with an associated date / timestamp in the indexing stage. This ensures that time series questions can be answered.
  8. Ensure users use the feedback mechanism to rate questions – this is a good proxy as to what questions are working well and what needs improvement.

What’s next?

QnA is a fantastic exploration of what it looks like to ‘democratise data’ to a wider audience than those who are comfortable writing SQL. It certainly isn’t going to supersede the need for people comfortable with SQL, but it will complement access for users with little or no SQL experience. The ask-interpret-clarify-execute loop is itself likely to be a valuable upskilling tool for many.

The conversational aspect of QnA is reasonably minimal at the moment, but I think we can expect it to expand and continue to integrate with other GCP services that are a natural fit. The recently announced Dialogflow CX lends itself to building experiences that are accessible outside the console (e.g., Slack, voice and mobile) to further increase the potential audience.

At the moment, support for data types is mostly restricted to the most common primitive types. More complex queries – those involving GIS (geographic) and nested operations – are probably on the horizon, but at that level of complexity your audience changes and it may not necessarily make sense to provide a 1:1 mapping between natural language and BigQuery SQL.

One of the final, and largest challenges that QnA may tackle is being able to link data to Google’s own knowledge graph.

This has the potential to move the service away from a purely syntactic interpretation of a query towards one with semantic understanding.

Although this is a much larger problem, it allows for things that aren’t currently possible, such as concept expansion – e.g., understanding that Best Director is one of many categories of Academy Award, or that the APAC acronym covers countries in the Asia-Pacific region (rather than the literal string interpretation).

The future. Are we there yet?

Not yet but we’re well on our way. Conversational interfaces are starting to look like one of the few survivors of the chatbot hype cycle.

Prior to filming 2001: A Space Odyssey, Kubrick asked the production team to come up with some hypothetical headlines to display on NewsPads – the film’s equivalent of today’s tablet devices. These never made it into the movie, and while some of them miss the mark, others are eerily prescient.

Interested in how QnA might fit into your business? Drop us a line over at the contact page (and we’ll try and be more helpful than HAL, I promise).


Published by Mike Robins

CTO at Poplin Data
