1 00:00:00,012 --> 00:00:05,719 [MUSIC] 2 00:00:05,719 --> 00:00:09,226 [SOUND] Writing SQL to discover the information stored in a database is often 3 00:00:09,226 --> 00:00:10,180 fun and rewarding. 4 00:00:10,180 --> 00:00:14,520 It can also sometimes feel like you're working your way down a rabbit hole. 5 00:00:14,520 --> 00:00:19,490 Creating complicated queries, joining multiple tables, and building complex 6 00:00:19,490 --> 00:00:24,680 reports with aggregate data often requires a lot of crazy looking SQL. 7 00:00:24,680 --> 00:00:29,030 What's worse is having to dig your way through the confusing logic of someone 8 00:00:29,030 --> 00:00:34,050 else's query, that includes nested sub queries and lots of joins. 9 00:00:34,050 --> 00:00:35,030 In this workshop, 10 00:00:35,030 --> 00:00:39,240 I'll teach you how to make your queries easier to read, easier to think about and 11 00:00:39,240 --> 00:00:43,580 easier to build using what's called a Common Table Expression or CTE. 12 00:00:44,860 --> 00:00:48,050 Even better, CTEs are really simple to learn. 13 00:00:48,050 --> 00:00:52,848 Most popular databases support CTEs, like Oracle, SQL Server, 14 00:00:52,848 --> 00:00:58,630 PostgreSQL, Redshift, SQLite, and MySQL version 8 and later. 15 00:00:58,630 --> 00:01:03,630 A common table expression is nothing more than an SQL query that you name and 16 00:01:03,630 --> 00:01:05,680 reuse within a longer query. 17 00:01:05,680 --> 00:01:07,360 It's a temporary result set. 18 00:01:07,360 --> 00:01:09,330 Think of it as a temporary table. 19 00:01:09,330 --> 00:01:11,660 Creating a common table expression is easy. 20 00:01:11,660 --> 00:01:16,390 You place a CTE at the beginning of a complete query using a simple syntax. 21 00:01:16,390 --> 00:01:20,960 Begin with the keyword WITH, followed by a name, the keyword AS, and 22 00:01:20,960 --> 00:01:23,740 a SELECT statement inside of parenthesis. 23 00:01:23,740 --> 00:01:28,210 The CTE name is like an alias for the results returned by the query. 24 00:01:28,210 --> 00:01:32,470 You can then use the name just like a table name in queries that follow the CTE. 25 00:01:33,940 --> 00:01:36,050 Lets look at a simple example. 26 00:01:36,050 --> 00:01:40,130 If you'd like to follow along click the SQL playground button on this page 27 00:01:40,130 --> 00:01:43,040 to open our online SQL tool. 28 00:01:43,040 --> 00:01:47,160 I'll start with a simple query to retrieve data from a business database 29 00:01:47,160 --> 00:01:49,880 containing product and sales information. 30 00:01:49,880 --> 00:01:53,140 This query combines data from two tables. 31 00:01:53,140 --> 00:01:59,660 I'll select a product name, a category name, a unit price, 32 00:01:59,660 --> 00:02:04,720 and the number of units that we have in stock for that from the products table. 33 00:02:04,720 --> 00:02:07,190 And then I join that to categories. 34 00:02:07,190 --> 00:02:09,510 And I'll simply match up the category ID, 35 00:02:09,510 --> 00:02:13,470 which is in the products table, to the ID in the categories table. 36 00:02:14,620 --> 00:02:18,810 I also want to make sure that I'm only looking for products that we still sell. 37 00:02:18,810 --> 00:02:21,780 In other words, products that are not discontinued. 38 00:02:21,780 --> 00:02:24,200 When I run this, I get a complete list of products, 39 00:02:24,200 --> 00:02:27,390 the categories they belong to, as well as pricing data. 40 00:02:27,390 --> 00:02:32,720 Nothing that special, but as I said, a CTE is just a result set. 41 00:02:32,720 --> 00:02:38,000 So I'll turn this into a CTE by adding the WITH keyword, a name, 42 00:02:38,000 --> 00:02:40,630 AS, and wrapping the query in parentheses. 43 00:02:43,230 --> 00:02:48,730 And to format this little bit, I'll indent inside the parentheses. 44 00:02:48,730 --> 00:02:51,270 Now, when I run this, I get an error. 45 00:02:51,270 --> 00:02:54,960 The code I just wrote only creates the common table expression. 46 00:02:54,960 --> 00:02:58,960 To build a complete query, we must had a select statement to query it. 47 00:02:58,960 --> 00:03:02,119 Like I said, the common table expression acts just like a table. 48 00:03:02,119 --> 00:03:06,600 So you can query against the CTE name like this. 49 00:03:06,600 --> 00:03:10,091 I'll select everything from product details. 50 00:03:10,091 --> 00:03:16,440 And then I'll order it by the category name and the product name. 51 00:03:19,338 --> 00:03:21,910 The name, product_details, 52 00:03:21,910 --> 00:03:25,900 gives us a clear idea of what kind of data we're including in this query. 53 00:03:25,900 --> 00:03:31,120 And since the CTE acts just like a table, we can query it just like a table. 54 00:03:31,120 --> 00:03:35,360 For example, say I wanted to determine how many different products and 55 00:03:35,360 --> 00:03:38,250 the total stock count for each category. 56 00:03:38,250 --> 00:03:41,270 I can just modify the SELECT statement to get at that data 57 00:03:41,270 --> 00:03:42,850 with the common table expression. 58 00:03:46,400 --> 00:03:49,070 I'll select the category name but this time, I'll do a count. 59 00:03:49,070 --> 00:03:53,580 And this will be the unique product count. 60 00:03:53,580 --> 00:03:57,110 I'll also sum or total up the number of units we have in stock. 61 00:03:59,650 --> 00:04:02,600 And I'll get that from our CTE product details. 62 00:04:03,790 --> 00:04:10,134 I'll group them by category name, and then I'll order them by count. 63 00:04:12,677 --> 00:04:16,820 This is really all there is to creating and using common table expressions. 64 00:04:16,820 --> 00:04:19,380 But you might be wondering, why use a CTE? 65 00:04:19,380 --> 00:04:21,770 These examples aren't all that exciting. 66 00:04:21,770 --> 00:04:24,880 And you can get the same results with fewer lines of code. 67 00:04:24,880 --> 00:04:27,530 That's true, and you probably won't use a CTE for 68 00:04:27,530 --> 00:04:30,020 a short query like the one I just showed. 69 00:04:30,020 --> 00:04:34,690 However, CTEs provide several key benefits for organizing long and 70 00:04:34,690 --> 00:04:36,800 complex SQL queries. 71 00:04:36,800 --> 00:04:40,770 First, your code is more readable using CTEs. 72 00:04:40,770 --> 00:04:42,010 As I mentioned earlier, 73 00:04:42,010 --> 00:04:45,910 the named query gives you information about the returned results. 74 00:04:45,910 --> 00:04:48,760 Names like product details, active customers, or 75 00:04:48,760 --> 00:04:53,380 recent transactions are clear and understandable when you see them. 76 00:04:53,380 --> 00:04:55,244 Readable SQL is a big help for 77 00:04:55,244 --> 00:04:59,285 other folks who have to work with your queries and a big help for 78 00:04:59,285 --> 00:05:04,201 you when you later have to return to your SQL and remember how a query works. 79 00:05:04,201 --> 00:05:09,800 Second, CTE's help you organize queries into reusable modules. 80 00:05:09,800 --> 00:05:13,600 Look at the code we've written, I can easily just grab the CTE, 81 00:05:14,790 --> 00:05:19,540 copy it and then paste it into a new query, knowing that I've got 82 00:05:19,540 --> 00:05:23,990 all the relevant SQL needed to capture the product details I'm after. 83 00:05:23,990 --> 00:05:28,290 Likewise, you can combine multiple CTEs into a single query. 84 00:05:28,290 --> 00:05:30,610 I'll show you how in the next video. 85 00:05:30,610 --> 00:05:35,150 So you could piece together CTEs almost like Lego blocks to create complex ways 86 00:05:35,150 --> 00:05:37,030 to look at your database data. 87 00:05:37,030 --> 00:05:40,870 Finally, organizing queries into common table expressions 88 00:05:40,870 --> 00:05:44,400 matches how we think through data analysis problems. 89 00:05:44,400 --> 00:05:47,800 We often think of our queries as result sets to begin with. 90 00:05:47,800 --> 00:05:52,250 For example, say you wanted to figure out which product categories were most popular 91 00:05:52,250 --> 00:05:54,100 with which customers in the past month. 92 00:05:55,570 --> 00:05:57,430 You might start off with something like okay, 93 00:05:57,430 --> 00:06:00,220 first I need all the orders in the past month. 94 00:06:00,220 --> 00:06:02,170 Then I need all the active customers. 95 00:06:03,370 --> 00:06:08,040 And finally, a list of all of our products and their categories. 96 00:06:08,040 --> 00:06:10,450 It's natural to think of queries in that way. 97 00:06:10,450 --> 00:06:13,480 And common table expressions help you write queries that match your 98 00:06:13,480 --> 00:06:14,820 thought process. 99 00:06:14,820 --> 00:06:20,320 For example, each comment here would represent one common table expression. 100 00:06:20,320 --> 00:06:22,660 In the next video, I'll show you a common use for 101 00:06:22,660 --> 00:06:27,370 common table expressions, converting subqueries to more readable code. 102 00:06:27,370 --> 00:06:29,143 You can close the SQL playground here. 103 00:06:29,143 --> 00:06:31,310 I'll have a new playground for you in the next video.