I’ve been working as a data and software engineer for more than 20 years. Not long after I joined my current employer Sounding Board, I had to normalize nested JSON arrays in a complex document schema so that I could join the child records to other collections and then denormalize data into a single result set — and I had to do it fast.
On top of that, I had to make that data available to our custom-built application via a secure RESTful endpoint with a less than one second response time. By day three of my new job at Sounding Board, I was able to meet those requirements, build, and demonstrate a real-time, reporting and analytics application using Rockset and Retool. I was amazed that I could do all of that without having to initially move and transform the data. One SQL statement got it done. Here’s how Rockset made me a day three hero at Sounding Board.
One of the technical challenges I had to tackle at Sounding Board was our need to report on deeply nested JSON data in a document database. Our plan — the same plan I would have used if I had not known about Rockset — was to build an ETL package, extract the data from the document database, then transform it into a format that would be stored in a data warehouse.
From there, the data could be ingested by any standard reporting tool. This approach would have worked, but it would have also been very time-consuming to build, would have required ongoing maintenance, and would have cost more.
DAY 1
On day one at Sounding Board, in the middle of being introduced to my team and completing the onboarding process, I was able to get read-only credentials to the MongoDB development database. From there, I simply created a free Rockset account and used Rockset’s MongoDB data connector to ingest the nested JSON data into a Rockset collection.
Rockset is a real-time database built for real-time analytics. I haven’t encountered another tool on the market that could have allowed us to produce a deliverable with this type of data so fast. It’s also a great relief knowing that as we grow, we don’t have to worry about performance degradation.
We were very impressed by Rockset’s Converged Index. Getting to see it in action with our own data was amazing. Using the search index component of the Converged Index allowed us to reduce the response time for a very complex multi-join query with multiple unnesting statements from 3500ms to 159ms.
DAY 2
On day 2, as I was learning a data schema I had never seen before, I was able to write the SQL, with some amazing help from Rockset. I extracted a string value containing deeply nested JSON data with multiple arrays, subdocuments, sub arrays, etc., and produced a flattened, denormalized dataset with all of the information I needed to supply to Retool.
One of my most favorite parts of the SQL statement was an awesome function called UNNEST(). This function allowed me to take an embedded array from my JSON document and turn it into the equivalent of an inner joined relational child table. From there, I was able to create a Rockset Query Lambda which is what produces the secure, managed, scalable, RESTful endpoint.
You can use this endpoint (i.e. the Query Lambda) in a POST request for any app or reporting tool that supports RESTful data sources. Rockset also has a JDBC driver. I ended up using this endpoint in Retool. When Retool executes the POST request, I get the result of my query as a JSON document.
By the end of day two, I had developed a simple Retool application that allowed me to pass in a couple of parameters to the Rockset Query Lambda, and voila! I had a web app that could access this treasure trove of data.
DAY 3
On day three, as I finished up the Retool app, I began to demonstrate the app and show various stakeholders the data they were longing to see. My manager, the vice president of engineering, was blown away by the speed at which I could not only access the data, but turn it into usable and reportable information. Needless to say, we are successfully using Rockset today to solve many other data challenges including creating new analytics to help our customers measure the return on investment they are making in leadership coaching. Our new coaching management platform will give them up to date access to rich analytics enabling them to successfully manage their coaching engagements.
Jon Farr is a principal data architect at Sounding Board.
Rockset is the real-time analytics database in the cloud for modern data teams. Get faster analytics on fresher data, at lower costs, by exploiting indexing over brute-force scanning.