Snowflake’s native handling of JSON in both READ and WRITE operations is by far and away my favorite feature. For reading JSON I love:
JSONDoc:Schema:Element::Cast
JSONDoc:Schema[0]:”Element”::Cast
JSONDoc:Schema:NestedSchema:Element::Cast
Lateral Flatten(Input => Parse_JSON(JSONDoc:Schema:Array)
Have you ever had a new-field or field-rename break your ETL pipelines to the Data Warehouse and ultimately your consumers of data? If you have ever done any data project, the answer is undoubtedly “yes”. To solve this ages-old problem with data, the confluence of some tenants of Hadoop’s schema-on-read and Snowflake’s Architecture separation come together in beautiful harmony on the Snowflake Data Cloud. In Snowflake, you can have a hybrid relational and JSON table in the same table. …
The Snowflake Data Marketplace is quickly becoming one of my favorite things about working at Snowflake. With the marketplace, I have live, instant access to some of the world's best datasets to give my data deeper context and meaning.
I had the pleasure of recently presenting a virtual hands-on lab (recording) with Snowflake using Starschema’s COVID-19 dataset (link) and Wunderman Thompson’s COVID-19 Risk, Readiness and Recovery dataset (link) built on the backbone of their fabulous Amerilink Insights dataset. I highly recommend both datasets to drive a deeper understanding of our collective pandemic context. The best part is they are both…
Last week I published a blog article showcasing how to create a COVID-19 quadrant scatter plot by county and turn the quadrants into a map by county https://bigdatadave.com/2020/11/22/tableau-snowflake-covid-19-by-county-quadrants-maps/. As you can see I had a problem with the big hole in Utah. “Utah is reporting county data somewhat differently than many other states. The larger-population counties are reporting confirmed cases and deaths at the county level. However, the smaller counties are banded together into county groups. This is in an effort to protect identities of individuals.” I just had to fix this as it was ruining my beautiful map.
Hole…
With the COVID-19 crisis worsening in America right now, I was curious which areas of the country are being affected more than others? I have been producing loads of SQL over the JHU data set provided by Starschema to the Snowflake Data Marketplace. I thought if I could enhance the JHU data at the county (FIPS) level with population, I could normalize Cases and Deaths per 100k population by County in the United States. Once I had this data in hand, I could produce a map showing a quadrant of counties by Case and Death rates. …
VS Code is the preferred IDE for many folks developing code for data and analytics. Now there is an extension allowing you to develop and execute SQL for Snowflake in VS Code. Huge thank you to Peter Kosztolanyi ( in) for creating a Snowflake Driver for the popular SQL Tools IDE extension for VS Code as documented here: https://marketplace.visualstudio.com/items?itemName=koszti.snowflake-driver-for-sqltools.
How to Get the Extension
Open VS Code and navigate to the extensions plugin and search for ‘snowflake.’ Install the free extension by clicking the green ‘Install’ button on the ‘Snowflake Driver for SQL Tools’ extension.
How to Use the Extension…
I often get asked for better ways to visualize or interact with task dependencies and hierarchies beyond show tasks. I was on a call with one of my favorite clients from Sunnyvale last Thursday, and we chatted through methods of using Snowsight to do this. With these SQL snippets and Snowsight charts, you, too, should be able to understand better your tasks, their run history, and dependencies. Just in case you do not have dependant tasks yet, I have included a brief sample set of dependent tasks.
Setting the Session Context First, if you have not already done so, navigate…
With pass-through SQL in Tableau and Snowflake you can set session variables in Snowflake to deliver on a variety of use cases. You can create dynamic derived tables, set database security contexts, route queries or expand database connectivity beyond your imagination. I already went over passing in data elements from Tableau to a Snowflake UDF (here). You can set session variables in Snowflake or just pass in parameter values from Tableau to do something in the database or to the data returned from Snowflake to Tableau.
In this example we will cover basic methods to pass in a session variable…
So, you might have some long running queries in your Snowflake account? Let’s say someone is running a recursive cartesian join because they used this WHERE clause “and (created_on >= ‘2020–08–03 00:00:00’ and created_on < ‘2020–08–31 00:00:00’) or (created_on >= ‘2020–02–03 00:00:00’ and created_on < ‘2020–03–02 00:00:00’)” instead of this one “and ((created_on >= ‘2020–08–03 00:00:00’ and created_on < ‘2020–08–31 00:00:00’) or (created_on >= ‘2020–02–03 00:00:00’ and created_on < ‘2020–03–02 00:00:00’))” …maybe that was me on SnowHouse yesterday…order of operations matter people. Here I never thought 8th grade Algebra II would come in handy in life. …
Snowflake is architecturally different from almost every traditional database system and cloud data warehouse. Snowflake has completely separated compute from storage and both tiers of the platform are real-time elastic. The need for advanced resource planning, combing over workloads with a fine-toothed comb, and denying new workloads onto the system over fear of disk and CPU limitation just goes away with Snowflake. Being a cloud native data warehouse, Snowflake can instantly scale to meet planned, ad-hoc or surprise growth. …
David spends his time working with Web-Scale Data Applications built on Snowflake with the largest tenants on the Snowflake Data Cloud.