Variance Analysis at Acme Global
Individual pledged graded assignment MP8
Goal

Learning how to use pivot tables in a Business Intelligence / Data Warehouse context
Loading a pivot table from disparate data in a database
Writing business intelligence SQL queries that mimic the outcome of a pivot table, and understanding the similarities and differences between Pivot and SQL.
Context
You are an analyst at ACME GLOBAL, a large global corporation. It is year-end and the CFO (your boss) wants you to perform a variance analysis of actual-versus-budgeted expenditures that are reported worldwide by the many ACME branches. A staff member in the IT department tells you that all the five ACME geographical regions have submitted their data in five tables in your DB at HQ (COMM3220). The tables are
ACME_N_AMERICA, ACME_S_AMERICA, ACME_PACIFIC_RIM, ACME_EUROPE, and ACME_ASIA. Note: all figures in 000.
At first you thought that you could do the variance analysis in plain Excel. You planned to compare actual and budgeted costs one by one, but then you realized that each of the five ACME regions has about 3,000 rows of financial data. It did not take you much to conclude that you do not want to spend a week doing this assignment, and that a better tool would be needed to speed up your work. Good news: both pivot tables and SQL can do the job.
Task
You will answer some questions twice, first using a Pivot table, and then using SQL.
1. Create a notebook called "MP8 YourLastname YourFirstName.ipynb".
2. Using SQL, Integrate the data from the five ACME regions into a single VIEW. No need to create an intermediate Table as you have done in the past. The view shows all the rows form the five tables and adds a new column called "Variance", where Variance = Budgeted-Actual expenses. Name the view <yourUserId>.ACME_GLOBAL_DATA.
3. Add a SELECT * for the view, and a DROP IF EXISTS for the view at the top of the notebook. This is for the TA to check your work. Do not drop the tables with the data from the five regions.
4. In Excel, go to Data > Get data > From Database > From SQL Server database > enter the sever name (f-sg6m-s4.comm.virginia.edu) and DB (COMM3220) > If necessary, select "Database" to enter your DATABASE credentials (not NetBadge nor CommBadge). OK if not encrypted. > Select the view that you just created > Click on "Load to" (not "Load") and choose "Table" and an appropriate place.
5. Click anywhere inside the Table you just created, then go to Insert > Pivot Table.
6. If you are not an expert with Pivot Tables, watch this video. It is 20 minutes long, but it is worth watching to the end because it will teach you how to use Pivot tables for this assignment and - more importantly - for your next job.
7. Use the Pivot Table to answer the business questions Q1-Q8 below. For each question, create a concise Pivot Table that answers the question, highlight in yellow the cell/s that contain/s the answer, and take a screenshot of the entire pivot table. Windows has an easy-to-use utility called 'Snip & Sketch'. MacOS has 'Screenshot'. A clear pic with your phone will also do.
8. Save the query texts and the screenshots in a .docx file. Call it "MP8 YourLastname YourFirstName.docx" and post it in your BOX folder.
9. Add to the notebook created in step 1 the text of the queries Q1-Q6 (yes: only Q1-Q6) and the SQL that answers them as you have done in the past: q1 text, then SQL, then q2 text, then SQL ... etc.
10. Post the notebook "MP8 YourLastname YourFirstName.ipynb" in your BOX folder.
Criteria
PIVOT: The pivot must show only what is asked, and each screenshot should show the entire pivot. Format all $ numbers ('$', commas, but no decimal). Ok if you show grand totals. Answer each question with one pivot table.
SQL: Your SQL must produce as much as possible the same data shown in the pivot, including formats. All columns must have easy-to-understand names (ok if not identical to the pivot).
Business questions
Note: variances are signed numbers. Negative variances are cases where actual expenses are larger than the corresponding budgets. So "a large negative variance" means a negative number with a large absolute value. By contrast, a variance is "small" if it is close to $0, no matter if it is positive or negative.
Q1. Which division had the largest total negative variance in May? Show all total variances for May, largest total negative variance on top. Do not show other months.
Q2. Within that division and month, which department? Show all total departmental variances, largest total negative variance on top.
Q3. Which department had the most accurate bonus budget across the year and globe? Show all departmental bonus budgets and their variances. (for the SQL, show the most accurate on top)
Q4. What are the top two sources (categories) of negative variance for R&D worldwide? Make the Pivot table only show the top two. If you cannot figure it out, show them all.
Q5. Show actual, budget, and variance totals for PR in Europe and Asia. Do not show data about other divisions.
If you cannot figure it out, show all Divisions.
Q6. Why is HR in Asia so over budget? Show the items, sorted by their contribution to the overage. Think about what 'contribution to the overage' means.
--- IMPORTANT: The true power of PIVOT is revealed when we use the CONTENT of cells in a given column to create new columns. Make sure you understand what you just read! For example, answering Q7 below requires to use the CONTENT of the cells in the 'month' column to create new columns ('Jan', 'Feb, 'Mar'....). This is trivial to do in the pivot table, but hard to do in SQL. SQL cannot easily do that, at least not without extensions and workarounds. T
Q7. Produce a table showing total variances by division (row) and by month (column). No highlight necessary.
Q8. Produce a table showing actual expenditures by department (row) and geographical area (column). No highlight necessary.
NOTE:
Appreciate that the assignment is not only about producing the right answers but also figuring out how to produce them using the given tools. As always, Google and a healthy does of trial and error are your friends here.
If you are trying to do this project on a Mac (especially older versions):
It might not work. Three alternative solutions:
1) Go to lab in RRH and use a windows PC
2) In at least one case a Mac user found a solution: the Mac Excel can connect to the COMM3220 DB, but it cannot see the Views
that you create. So, as a fix, create the view (to satisfy the requirements of the project) but also create a table that looks
like the view (so Excel on your Mac can see it).
This will create a table identical to the view (you will need to call it something different). At that point, your Mac
should be able to connect and see the table, which will allow you to complete the project. Do not report in your homework the SQL
for the table; just the SQL for the VIEW.
3) If you cannot connect to the server.
a) Go to ADS and select * from the view (make sure you visualize all the rows)
b) download to your PC the results from the select using the little icon close to the result of the select
c) load in Excel the file produced by ADS. Turn the data into a Excel table and you are good to go.