Homework 10

Tesla's short-term trading data


Goal

By completing this exercise you will learn several "data wrangling" techniques to clean up digital data and preparing it for analysis. 

source:web


Context

“Life is good!” you say to yourself: your promotion to investment analyst was a win, and your new office and salary are big improvements. You feel ready for the next set of challenges. They come right away, in the shape of an excited Ms. Lendalot walking into your new office.
“Here you are! Faaaantastic office! Do you own a Tesla?”
“I do not.”
“Pity… I was hoping that you could help me understand the success of these electric cars.  I am looking at some data that I find rather puzzling.”
“Puzzling?”
“Yes, you see, we have a bunch of customers who buy Tesla shares, keep them only for a few days, and then sell them all…"
“Short-term trading? Are they timing something?"
“Not sure. On top of that, some of them use our US broker, and others go through our French affiliate and purchase them with Euros.“
“ Currency arbitrage?”
“Again, I do not know. I do not even know if they make any money out of all these trades.  Maybe these clients know something we do not…. Anyhow, I want you to explore these trading data and let me know what you think.   By the way: the data is rather messy. Clean it before performing the analysis.”
“I am on it, Boss.”


Task

Ms. Lendalot asked you to examine a set of financial trading data. Unfortunately, the data given to you has many quality issues which you will need to fix before proceeding with the analysis.
This homework has two parts. The first part, described here, is designed to show you examples of automated data cleaning: invalid identifiers, phone numbers that are missing or not consistently formatted, and dates and numbers that are formatted with foreign standards. The second part (next homework) will focus on answering the business question asked by your CFO.
Refer to the class slides to create an Excel app with Visual Studio. As always, the video will point you in the right direction yet it may not be the solution.  The data may be different.


Requirements

  • One button is shown at start, after loading the data six more appear.  After use, each button disappears.
  • A button identifies invalid CIDs (99999) and marks them with a red background
  • A button identifies invalid or missing DOBs and marks them with a yellow background
  • A button identifies invalid or missing Phone numbers and marks them with an orange background
  • All valid phone numbers (valid = ten digits) are formatted as (xxx)-xxx-xxxx
  • A button identifies dates entered with the European standard dd/mm/yyyy, turns them into mm/dd/yyyy, and marks them with blue ink
  • The last two buttons are inactive for now
  • After all steps MessageBoxes communicate how many records were marked/modified, etc. as show in the video.

The financial data is here (right click and save as...).  You asked coPilot about "ingesting a .csv file in VBA" and by strike of good luck you find a snippet of code that does just that.

As usual, the video might show different numbers or a solution that is close but it is not what you have to produce. This is intentional: you need to learn to go beyond watch-and-type and instead figure out what your are asked to implement (the requirements).


Data Dictionary (column & definition)

   A Customer id
   B First and Last name
   C Date of birth
   D Phone number
   E Business unit (US or France)
   F Date bought
   G Total paid
   H Purchase price
   I  No of shares
   J Date Sold
   K Selling price



Solutions to critical errors and crashes