MPP Data Science Final Project

Recently I wrote about my experience working through the Microsoft Professional Program in Data Science in my blog post Data Science Breakdown. I am so happy to announce that I successfully completed the capstone! Better still, I earned the MPP Data Science Certificate!

“Was it difficult?”

The number one question I get asked is whether the capstone was difficult. The short answer is “yes.” The capstone is designed to test the knowledge and skills gained from the classes in the course, and to broaden your sense of how you would handle “real-world” projects. The next question I get is “Do you think I can pass it?” Absolutely… if you give it your all.

The capstone I completed was divided into three parts:
* analyzing data
* building and testing a machine learning model
* writing a professional report

The Sum of Three Parts

Part 1, Analyzing Data

The capstone really put me to work. Reviewing and analyzing data is easy and straightforward for me. I really enjoy looking between the lines and finding the patterns that emerge. The capstone lets students use any analytics program they like. For this portion, Excel was my choice, and it proved to be a wise one. Part 1 took mere minutes for me to complete. It was the second part of the capstone where the majority of my time and effort was spent.

Part 2, Machine Learning

To create a successful machine learning model, you will need to be proficient in your Google-Fu. Building the machine learning model took more knowledge and experience than I had gained from the coursework alone. I spent many hours researching approaches and algorithms and trying to figure out the best way to train the unruly model.

I was frustrated, tired, and admittedly ready to give up. REALLY AND TRULY. A thought occurred: “This is why they call it ‘Data Science’. I am sitting here trying to find answers along an untrodden path.” In that moment, I imagined this is what it feels like to be a true scientist. Just as in physical science, data science requires time, patience, research, trial and error.

Not being one to give in, I persisted until the best combination of algorithms was found. It was a profound relief to test the model against my data and to see it be successful.

Giving Thanks

So many people in the community helped guide me in the direction to find answers. Thank you to all who wrote blogs, tweets, whitepapers, and produced videos. Sharing your unique views and understanding with others makes us a stronger community.

Part 3, Professional Report

Armed with my data, my experiences, and my successful machine learning model, the final step in the capstone was to put it all together in a written report! Holy goodness. This was much more difficult than I had expected it to be. Your grade for the third part depends on other students’ assessment of your report. Yet another real-life experience to get you ready for a career as a Data Scientist!

The report took a few days to complete; on the last day I worked a solid 30 hours straight on it. I could not sleep; I was excited, terrified, stressed, and also just REALLY ready to get this completed. Once my report was submitted, I began reading the other students’ reports as assigned. The reports were outstanding. Those students brought to light information that had not shown itself in my project. Each report was vastly different; the stories the data told the other three students differed from one another and from mine. How interesting that we all had the same data sets and somehow all four of us presented completely different stories!

Lessons Learned

Undoubtedly, the classes in the MPP Data Science course taught me a number of valuable skills while also having the unintended consequence of teaching me some things about…well…me.

When faced with something new and extremely difficult, I learned that I have the ability to rise up, learn, and be successful. I also learned that the willingness to learn something new can set you apart from others.

At the end of the capstone, while reviewing other students’ reports, I saw opportunities for more learning as their approaches to the same material differed from mine.

Finally, I learned that when we work together for a common goal, we are stronger and smarter together.


Power BI Data Gateway

What is a Data Gateway in Power BI?

When creating reports in Power BI, the end goal is to make them useful to many users. In order to share reports created in Power BI, they must be published to the cloud (known as PowerBI.com). Once nestled in the cloud, the data in the reports will either stay static or need to be updated on a regular basis. In order to refresh data and keep end users up to date, the cloud must have access to the data sources. This is where you need a Data Gateway. Think of a data gateway as a bridge between your on-premises data sources and the cloud.

A gateway should be installed on a machine that is always on and connected to the internet. A gateway cannot reach your data while its machine is powered off or has lost its internet connection.

  •  Before installing, keep in mind that if you install on a laptop and it is turned off, asleep, or not connected to the internet, the gateway won’t work and the data in the cloud will not sync with your on-prem data. Also, if the machine on which the gateway is installed is connected to a wireless network, the gateway may perform more slowly, and it will take longer for the cloud to sync with your on-prem data.

Power BI Gateway can be installed in two ways:

  • On-premises data gateway – This gateway can be used by any user that has access to the server on which the gateway is installed. It can be used for scheduling refreshes and live queries.
  • On-premises data gateway (Personal mode) – This gateway can only be used by the person who set it up. This mode is only used for scheduling refreshes in Power BI. At the time of writing, Live Connection, DirectQuery, Power Apps, Logic Apps, and Microsoft Flow are not supported.
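The choice between the two modes can be written down as a small decision rule. The sketch below is only an illustration of the two bullets above, not anything Power BI provides: the function name `choose_gateway_mode` and its parameters are made up for this example, and the unsupported-feature list is copied from the Personal mode bullet.

```python
STANDARD = "On-premises data gateway"
PERSONAL = "On-premises data gateway (Personal mode)"

# Features the Personal mode bullet lists as unsupported (at the time of writing).
PERSONAL_UNSUPPORTED = {
    "Live Connection",
    "DirectQuery",
    "Power Apps",
    "Logic Apps",
    "Microsoft Flow",
}


def choose_gateway_mode(shared_with_others, features):
    """Pick a gateway mode (hypothetical helper, for illustration only).

    shared_with_others: True if anyone besides the installer needs the gateway.
    features: set of feature names the gateway must support.
    """
    needs_standard = shared_with_others or bool(set(features) & PERSONAL_UNSUPPORTED)
    return STANDARD if needs_standard else PERSONAL


if __name__ == "__main__":
    print(choose_gateway_mode(False, {"Scheduled refresh"}))
    print(choose_gateway_mode(False, {"Scheduled refresh", "DirectQuery"}))
```

In short: if only you need the gateway and scheduled refresh is all you want, Personal mode is enough; anything shared or live calls for the standard gateway.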

Only one gateway in each mode can be installed on one machine. That is, you may install one gateway in personal mode, and another in regular mode. You cannot install two or more personal mode gateways on one machine. You can, however, manage multiple gateways from the same interface on Power BI.

Installing a Gateway

To install a gateway, you will first need to sign in to PowerBI.com. Take note that this is NOT the desktop app; this is the cloud-based service. At the top right of the menu bar, click the icon that looks like an arrow pointing down. The dropdown will reveal several actions. You will want to choose ‘Data Gateway’.


This will take you to a new webpage where you will be able to start your gateway download. Click on the DOWNLOAD GATEWAY button and wait for the download to begin. Once the installer has finished downloading, open the .exe and follow the instructions.


When the installer opens, you will be ready to start setting up your gateway.


Click NEXT to choose the type of gateway you need. Before you choose, take into consideration the role of each. Remember that Personal mode is only useful for on-demand and scheduled refresh in Power BI and cannot be used for Live Connection or DirectQuery. The On-premises data gateway can be used by multiple users and supports both scheduled refresh and DirectQuery.

Please note the following in regard to installing either mode:

  •  both gateways require 64-bit Windows operating systems
  •  gateways can’t be installed on a domain controller
  •  you can install up to two On-premises data gateways on the same computer, one running in each mode (personal and standard)
  •  you cannot have more than one gateway running in the same mode on the same computer
  •  you can install multiple On-premises data gateways on different computers, and manage them all from the same Power BI gateway management interface (not including Personal mode)
  •  You can only have one Personal mode gateway running for each Power BI user. If you install another Personal mode gateway for the same user, even on a different computer, the most recent installation replaces the previous one.
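The last few rules in this list interact, so here is a toy model of them in Python. It is purely illustrative and assumes nothing about how Power BI tracks gateways internally: the `GatewayRegistry` class, its fields, and the machine and user names are all hypothetical.

```python
class GatewayRegistry:
    """Toy model of the installation rules listed above (not a real API)."""

    MODES = ("standard", "personal")

    def __init__(self):
        self.by_machine = {}        # (machine, mode) -> gateway name
        self.personal_by_user = {}  # user -> machine hosting their personal gateway

    def install(self, machine, mode, user, name):
        if mode not in self.MODES:
            raise ValueError(f"unknown mode: {mode}")
        # Rule: at most one gateway per mode on the same computer.
        if (machine, mode) in self.by_machine:
            raise ValueError(f"{machine} already has a {mode} gateway")
        if mode == "personal":
            # Rule: one Personal mode gateway per user; a newer install
            # replaces the previous one, even on a different computer.
            previous_machine = self.personal_by_user.get(user)
            if previous_machine is not None:
                del self.by_machine[(previous_machine, "personal")]
            self.personal_by_user[user] = machine
        self.by_machine[(machine, mode)] = name


if __name__ == "__main__":
    registry = GatewayRegistry()
    registry.install("server1", "standard", "ana", "team-gateway")
    registry.install("server1", "personal", "ana", "ana-personal")
    # Installing Personal mode again elsewhere replaces the one on server1.
    registry.install("laptop", "personal", "ana", "ana-personal-2")
    print(sorted(registry.by_machine))
```

Note that one machine can hold a standard and a personal gateway side by side, while a second gateway in the same mode on that machine is rejected.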


Once you have chosen your mode and clicked Next, it will take a few seconds for the installer to download what it needs and get ready to install your gateway.

The next step is to point the installer to the drive on which you want the gateway installed. You will want the gateway positioned as close to your data source as possible. Be sure to read and accept the terms of use and privacy statement.

Upon successful installation, you will need to add an email address to use with this gateway. Next you will need to sign in.


We have a successful installation of our gateway! Now you will have the option to configure a new gateway, or to migrate, restore, or take over an existing gateway. Here we will register the data gateway.

Register On-Prem

To configure a new gateway, you will need to enter a name for the gateway, enter a recovery key (minimum 8 characters) and finally, select Configure. Be sure to store your recovery key in a safe place. You will need it in the future if you ever need to migrate, restore, or take over a gateway.
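If you generate or store gateway settings in a script, a quick pre-check against the requirements mentioned above can save a failed attempt. This is a minimal sketch that only mirrors the rules stated in this post (a name is required, and the recovery key needs at least 8 characters); the function `check_gateway_config` is hypothetical and is not how the installer itself validates input.

```python
MIN_RECOVERY_KEY_LENGTH = 8  # minimum stated in the configuration step above


def check_gateway_config(name, recovery_key):
    """Return a list of problems; an empty list means the inputs look fine.

    Hypothetical helper for illustration; the real installer may apply
    additional rules that are not covered here.
    """
    problems = []
    if not name.strip():
        problems.append("gateway name is empty")
    if len(recovery_key) < MIN_RECOVERY_KEY_LENGTH:
        problems.append(
            f"recovery key must be at least {MIN_RECOVERY_KEY_LENGTH} characters"
        )
    return problems


if __name__ == "__main__":
    # Example values only; never hard-code a real recovery key in a script.
    print(check_gateway_config("finance-gateway", "short"))
```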


Congratulations, you now have a successfully installed and configured gateway! Now you will be able to connect to on-prem data sources! For use with Power BI, you will need to add your data sources to the gateway within the Power BI service. This is done by going to the menu bar, clicking on the gear icon, and choosing MANAGE GATEWAYS from the dropdown. We will cover adding data sources in the next blog!

*For a more in-depth look at gateway installation, see Microsoft Docs.