Readings

#12
Many in class had trouble with the notion of identifying/non-identifying relationships. Review the vLab on mySQLWB (10:00 forward), in particular the section on identifying and non-identifying relationships.
Literacy: define identifying vs non-identifying. Find a couple of examples and be ready to explain why they are identifying / non-identifying.
Bring questions on the Digisonos/UBC project, if any.
We will resume the WINIT section.
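As a study aid for the identifying/non-identifying distinction, here is a minimal sketch using Python's built-in sqlite3 module. The tables (orders, order_item, department, employee) and the helper functions are made up for illustration and do not come from the vLab. The idea it shows: a relationship is identifying when the child's foreign key is part of the child's primary key, and non-identifying when it is not. The sketch assumes each child table has at least one foreign key.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE orders (order_id INTEGER PRIMARY KEY);

-- Identifying: an order item cannot be identified without its order,
-- so the foreign key order_id is part of the primary key.
CREATE TABLE order_item (
    order_id INTEGER NOT NULL REFERENCES orders(order_id),
    line_no  INTEGER NOT NULL,
    product  TEXT NOT NULL,
    PRIMARY KEY (order_id, line_no)
);

CREATE TABLE department (dept_id INTEGER PRIMARY KEY, name TEXT NOT NULL);

-- Non-identifying: an employee has its own key (emp_id);
-- the foreign key dept_id is an ordinary attribute.
CREATE TABLE employee (
    emp_id  INTEGER PRIMARY KEY,
    name    TEXT NOT NULL,
    dept_id INTEGER REFERENCES department(dept_id)
);
""")


def primary_key_columns(table):
    # PRAGMA table_info rows: (cid, name, type, notnull, dflt_value, pk)
    # pk is 0 for non-key columns, otherwise the 1-based position in the key.
    rows = conn.execute(f"PRAGMA table_info({table})").fetchall()
    return [r[1] for r in sorted(rows, key=lambda r: r[5]) if r[5] > 0]


def foreign_key_columns(table):
    # PRAGMA foreign_key_list rows: (id, seq, table, from, to, ...)
    return [r[3] for r in conn.execute(f"PRAGMA foreign_key_list({table})")]


def relationship_kind(table):
    fks = foreign_key_columns(table)
    if not fks:
        raise ValueError(f"{table} has no foreign key")
    pk = set(primary_key_columns(table))
    return "identifying" if all(c in pk for c in fks) else "non-identifying"


print(relationship_kind("order_item"))
print(relationship_kind("employee"))
```

In MySQL Workbench diagrams the same distinction typically shows up as a solid line (identifying) versus a dashed line (non-identifying).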
#13
A pivot table is a data analysis tool. It is used to "slice and dice" data and to drill down into it. Pivot tables work by grouping and ungrouping data contained in a larger table, according to criteria specified by the user. Pivot tables can also transpose rows and columns, and perform calculations such as subtotals and grand totals.
The first 11 minutes of this video will give you an extensive example of how to create and use a pivot table with a small data set.
This 45-second video and short text by Microsoft is a quick reminder of how to create pivot tables from data in Excel.
This 90-second video and short text is a summary of how to manipulate pivot tables in Excel.
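To make the grouping idea concrete outside Excel, here is a small sketch in plain Python. The sales data and the pivot function are invented for illustration; this is not how Excel implements pivot tables, only the same logic on a tiny scale: group rows by one field, spread another field across columns, sum a value, then compute subtotals and a grand total.

```python
from collections import defaultdict

# Hypothetical flat table: one row per sale.
sales = [
    {"region": "East", "quarter": "Q1", "amount": 100},
    {"region": "East", "quarter": "Q2", "amount": 150},
    {"region": "West", "quarter": "Q1", "amount": 200},
    {"region": "West", "quarter": "Q1", "amount": 50},
    {"region": "West", "quarter": "Q2", "amount": 75},
]


def pivot(rows, row_key, col_key, value_key):
    """Group rows by row_key, spread col_key across columns, sum value_key."""
    table = defaultdict(lambda: defaultdict(int))
    for r in rows:
        table[r[row_key]][r[col_key]] += r[value_key]
    return {rk: dict(cols) for rk, cols in table.items()}


# Regions as rows, quarters as columns.
by_region = pivot(sales, "region", "quarter", "amount")

# Transposed view: quarters as rows, regions as columns.
by_quarter = pivot(sales, "quarter", "region", "amount")

# Subtotal per region and grand total over all sales.
subtotals = {region: sum(cols.values()) for region, cols in by_region.items()}
grand_total = sum(subtotals.values())

print(by_region)
print(by_quarter)
print(subtotals, grand_total)
```

Swapping the row and column arguments is all it takes to transpose the view, which is what dragging fields between Rows and Columns does in the Excel pivot table interface.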
#14
Read chapter 7 of the textbook. It covers Data Warehouses.
Literacy: DW, differences between transactional and analytical or informational DBs, subject/application oriented, Vitality Health Club example, time-variant, non-volatile, ETL, Data mart.
#15
Read Chapter 8 of the textbook up to "A note about comparing...". It covers DW design and is important for the next homework.
Literacy: It is important that you explain these in your own words and with some simple examples, ideally not the same as the book's: fact table, measures and dimensions, dimensional modeling, granularity, snowflake, star schema, slowly changing dimensions and the Type 1, 2, and 3 approaches, Inmon, Kimball.
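As one possible starting point for your own examples, here is a minimal star schema sketch using Python's built-in sqlite3 module. The table names, columns, and rows (fact_sales, dim_product, dim_date) are invented for illustration and are not taken from the textbook. The fact table holds the measures (units, revenue) at a granularity of one row per product per date; the dimension tables hold the descriptive attributes used to group them.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
-- Dimension tables: descriptive attributes.
CREATE TABLE dim_product (
    product_key INTEGER PRIMARY KEY,
    name        TEXT,
    category    TEXT
);
CREATE TABLE dim_date (
    date_key INTEGER PRIMARY KEY,
    year     INTEGER,
    month    INTEGER
);

-- Fact table: foreign keys to each dimension plus numeric measures.
CREATE TABLE fact_sales (
    product_key INTEGER REFERENCES dim_product(product_key),
    date_key    INTEGER REFERENCES dim_date(date_key),
    units       INTEGER,
    revenue     REAL
);

INSERT INTO dim_product VALUES (1, 'Yoga mat', 'Fitness'), (2, 'Protein bar', 'Food');
INSERT INTO dim_date VALUES (20240101, 2024, 1), (20240201, 2024, 2);
INSERT INTO fact_sales VALUES
    (1, 20240101, 3, 60.0),
    (2, 20240101, 10, 25.0),
    (1, 20240201, 2, 40.0);
""")

# Analytical query: aggregate a measure by a dimension attribute.
revenue_by_category = dict(conn.execute("""
    SELECT p.category, SUM(f.revenue)
    FROM fact_sales AS f
    JOIN dim_product AS p ON p.product_key = f.product_key
    GROUP BY p.category
"""))

print(revenue_by_category)
```

In a snowflake version of the same design, category would move out of dim_product into its own table that dim_product references.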
#16
Read the instructions to procure the Tableau software. If you plan to install it on your computer, start soon. It will take a little time to get the license.
Next, go through the Tableau training. Follow the steps from the beginning to step 8. Publish the results of your training on Tableau Public and post the Tableau link to your box by the beginning of class. Call it "TableauTraining.html". It will count as class participation for the day. You are encouraged to explore and play around with the app as you go through the steps, but the final posted link must look very similar to the one in the training materials.
If you do not find the training dataset already installed, here is a copy. In Tableau, go to File > New > Sample - Superstore.xls > drag the order table > click on Sheet 1.
The writer of the training program identifies some insights. Do you agree with all of them? Note down the ones that seem questionable for class discussion.
#17
As part of the BI wars, Microsoft recently expanded its offerings with Power Query, Power Pivot, and Power BI. Tableau is responding by extending the capability of its own product line and has created a tool to clean and wrangle data before visualizing it. It is called Tableau Prep. Rather than reading a paper on Tableau Prep, we will install it on our computers and use it. Go through the Get Started with Tableau Prep Builder tutorial from beginning to end. Call the final file "PrepOutput" and post it in your box. It will count as class participation for the day. You are encouraged to explore and play around with the app as you go through the steps, but the final output must look very similar to the one in the training materials. In class I will ask you about the experience and your thoughts on the tool.
#18
Read Chapter 10 from the beginning to the Example (excluded), and then from Corporate use of big data to the end.
Next, read Appendix 10. Watch How NoSQL Databases work.
Literacy: be prepared to explain these terms in your own words, without reading notes from the book. Examples always help: Big data, Data lake, Schema on demand, Schema on read, NoSQL
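For schema on read, a tiny sketch in plain Python may help as a seed for your own example. The records and the read_purchases function are made up for illustration. The raw records are stored as-is, with no common structure enforced when they are written; a structure is imposed only at the moment a particular question is asked of the data.

```python
import json

# Hypothetical raw records, stored as-is (as in a data lake).
# No shared schema was enforced at write time, so the fields differ.
raw_events = [
    '{"user": "ana", "action": "login", "ts": "2024-01-05T10:00:00"}',
    '{"user": "bo", "action": "purchase", "amount": "19.90"}',
    '{"user": "cy", "device": {"os": "android"}}',
]


def read_purchases(lines):
    """Apply a 'purchase' schema at read time: keep user and numeric amount."""
    for line in lines:
        record = json.loads(line)
        if record.get("action") == "purchase":
            yield {
                "user": record["user"],
                # The type conversion happens on read, not on write.
                "amount": float(record.get("amount", 0)),
            }


purchases = list(read_purchases(raw_events))
print(purchases)
```

A relational database works the other way around (schema on write): the third record above would be rejected or reshaped before it could be stored.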