The number one question I get asked is whether the capstone was difficult. The short answer is “yes.” The capstone is designed to test the knowledge and skills gained from the classes in the course, as well as to expand your horizons on how you would deal with “real-world” projects. The next question I get is “Do you think I can pass it?” Absolutely… if you give it your all.
The capstone I completed was divided into three parts:
* analyzing data
* building and testing a machine learning model
* writing a professional report
The Sum of Three Parts
Part 1, Analyzing Data
The capstone really put me to work. Reviewing and analyzing data is easy and straightforward for me. I really enjoy reading between the lines and finding the patterns that emerge. The capstone lets students use any analytics program they like. For this portion, Excel was my choice, and it proved to be a wise one. Part 1 took mere minutes for me to complete. It was the second part of the capstone where the majority of my time and effort was spent.
Part 2, Machine Learning
In order to create a successful machine learning model, you will need to be proficient in your Google-Fu. Building the machine learning model took more knowledge and experience than I had gained in the coursework alone. I spent many hours exploring and researching various approaches and algorithms, trying to figure out the best way to train the unruly model.
I was frustrated, tired, and admittedly ready to give up. REALLY AND TRULY. A thought occurred: “This is why they call it ‘Data Science’. I am sitting here trying to find answers along an untrodden path.” In that moment, I imagined this is what it feels like to be a true scientist. Just as in physical science, data science requires time, patience, research, trial and error.
Not being one to give in, I persisted until the best combination of algorithms was found. It was a profound relief to test the model against my data and to see it be successful.
So many people in the community helped guide me in the direction to find answers. Thank you to all who wrote blogs, tweets, whitepapers, and produced videos. Sharing your unique views and understanding with others makes us a stronger community.
Part 3, Professional Report
Armed with my data, my experiences, and my successful machine learning model, I reached the final step in the capstone: putting it all together in a written report! Holy goodness. This was much more difficult than I had expected it to be. Your grade for the third part depends on other students and their assessment of your report. Yet another real-life experience to get you ready for a career as a Data Scientist!
The report took a few days to complete; on the last day I worked a solid 30 hours straight on it. I could not sleep; I was excited, terrified, stressed, and also just REALLY ready to get this completed. Once my report was submitted, I began reading the other students’ reports as assigned. The reports were outstanding. Those students brought to light information that had not shown itself in my project. Each report was vastly different; the stories the data told the other three students differed from each other and from mine as well. How interesting that we all had the same data sets and somehow all four of us presented completely different stories!
Undoubtedly, the classes in the MPP Data Science course taught me a number of valuable skills while also having the unintended consequence of teaching me some things about…well…me.
When faced with something new and extremely difficult, I learned that I have the ability to rise up, learn, and be successful. I also learned that the willingness to learn something new can set you apart from others.
At the end of the capstone, while reviewing other students’ reports, I saw opportunities for more learning as their approaches to the same material differed from mine.
Finally, I learned that when we work together for a common goal, we are stronger and smarter together.
You have undoubtedly heard that Data Science is one of the fastest-growing fields in the data industry and one of the best jobs in America. While many people are interested in a career in data science, they are afraid it might take more than they have to offer. I was one of these people. I was afraid that I didn’t have the knowledge or mental (or mathematical) aptitude needed for such a career. Being the unwavering person I am, I set a goal to learn more and then went on a search for information. I found the Microsoft MPP in Data Science program and thought “Well, I can at least give it a try.”
*Let me pause here and applaud Microsoft for partnering with EdX.org to assemble this training… and for making it available to anyone and everyone for FREE. You can take and complete these classes for free. You only pay if you decide to complete the classes for verified certificates (needed to complete the MPP Certification).
What it takes
Are you interested in studying data science? Ask yourself these questions:
Do you have an interest in exploring abstract ideas?
Are you a curious person?
Do you feel comfortable seeking answers in unique ways?
Do you love exploring new programs and technology?
Are you interested in finding the story within the story?
Are you good at finding patterns where there seem to be only random ideas and images?
Are you interested in working with data?
What is a Data Scientist?
If you answered yes to the above questions, you just might be the next great Data Scientist! Let’s break down what a Data Scientist does. The role of data scientist is a unique one, as it requires an ability to think on your feet, think outside the box, be creative with technology, and be somewhat of an entrepreneur. Data science walks the fine line between technology and creative storytelling. A data scientist is one who knows how to use various means to pull narratives from data to create a great story. You see, data is not merely a static table of letters and numbers. No, it is much more than just digits in a row. Data is a living, breathing, ever-evolving collection of information that is searching for a way to tell its story. Data scientists are curious, technically equipped storytellers exploring the data landscape for the next great story to share. Sound interesting? If so, keep reading!
Data Science Tools
On my journey to becoming a Microsoft MPP in Data Science, I started where we all start… at the beginning. The very first class in the MPP Course is Introduction to Data Science. This is your typical intro class. It is easy, but very important. It will guide you through what to expect and how to navigate the classes, and it provides an overview of the basic concepts and principles on which data science is based.
There are a number of tools in the data science repertoire. For the purpose of this blog, we will focus on the tools one can learn through the Microsoft MPP courses.
Analyzing & Visualizing Data
The first tool we look at is for analyzing and visualizing data. The MPP course gives you a choice between working with Power BI or Excel. As I have previous experience with Excel and feel pretty confident there, I chose to learn something new and went with Power BI. I found Power BI to be a super fun tool that felt more like a video game and less like work. I love a good visual! This class easily walked me through setup and through a variety of use-case scenarios. I found it very fun and easy to learn. In fact, what struck me the most about these classes is how concise yet easy to follow they are.
Communicate Data Insights
Now that you understand the basics of analyzing and visualizing data, it is important to know how to master data communication. It is one thing to be able to look at data and understand it; it takes a completely different set of skills to convey the stories the data has to tell. In the next course, Analytics Storytelling for Impact, you will learn how to fully explore a story to find what a great story is, and what it is not. This course really dives into how to make an impact through storytelling, shows you how to create impact through presentations and reports, and teaches you to apply these skills to your data analytics. I thoroughly enjoyed this class as it spoke to the theater major in me. I do love to tell a good story, and this class gave me new ways to look at data. It has left me questioning things I see every day, like political polls, job descriptions, and advertisements.
Apply Ethics and Law in Analytics
Ethics? What does ethics have to do with being a data scientist? Admittedly, when I first saw that the program had been updated with Ethics and Law in Data and Analytics, I was a bit taken aback. I thought I had left the legal field and was on the way to a technical role. Why learn ethics? Data science and data collection have changed wildly and quickly over the last few years. It is my firm belief that every data professional needs to take this course. Only through taking this course did I learn that data can be accidentally prejudiced! Certainly ethics should be considered when collecting and analyzing data! A data scientist would be remiss not to do that due diligence!
Query Relational Data
The data scientist must know how to query databases in order to get the data needed to analyze. The MPP program offers Querying Data with Transact-SQL, where you will learn to query and modify data in SQL Server or Azure SQL using T-SQL. If you are not familiar, SQL is commonly pronounced “See-Quil” in the industry rather than “Es-Que-El”… it is a pet peeve of mine to hear someone say S-Q-L when talking to me about SQL. This course was very thorough and a great way to step into learning how to query and program using T-SQL. This class will take some effort; I found it to be one of the more intensive classes in the program. SQL is no easy task, and SQL Server has many versions out there in practical use, each version with different hurdles to jump. This particular class is a fantastic place to start to learn a great deal about SQL.
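To give a feel for what "querying a database to get the data you need" looks like, here is a minimal sketch. It is not taken from the course: it uses Python's built-in `sqlite3` module as a stand-in for SQL Server, and the `orders` table, its columns, and its rows are all hypothetical. The query sticks to basic `SELECT`, `WHERE`, `GROUP BY`, and `ORDER BY`, which read much the same in T-SQL, though the two dialects differ in plenty of other places.

```python
import sqlite3

# In-memory SQLite database standing in for SQL Server (an assumption for
# illustration; the course itself works with SQL Server or Azure SQL).
conn = sqlite3.connect(":memory:")

# Hypothetical table and rows, invented purely for this example.
conn.execute(
    "CREATE TABLE orders ("
    "id INTEGER PRIMARY KEY, customer TEXT, region TEXT, amount REAL)"
)
conn.executemany(
    "INSERT INTO orders (customer, region, amount) VALUES (?, ?, ?)",
    [
        ("Avery", "East", 120.0),
        ("Blake", "West", 75.5),
        ("Casey", "East", 210.0),
        ("Devon", "West", 40.0),
    ],
)


def totals_by_region(connection, min_amount):
    """Return (region, order_count, total_amount) for orders >= min_amount."""
    query = """
        SELECT region, COUNT(*) AS order_count, SUM(amount) AS total_amount
        FROM orders
        WHERE amount >= ?
        GROUP BY region
        ORDER BY total_amount DESC
    """
    # The ? placeholder passes min_amount as a parameter instead of
    # pasting it into the SQL string.
    return connection.execute(query, (min_amount,)).fetchall()


region_totals = totals_by_region(conn, 50)
for region, order_count, total_amount in region_totals:
    print(region, order_count, total_amount)
```

The filter runs before the grouping, so the 40.0 order is dropped first and each region's count and total cover only the orders that remain.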
Explore Data with Code
The next step in the program is to explore data with code. You are given two options here: one path is Introduction to R for Data Science and the other is Introduction to Python for Data Science. For my interests, I chose Python since it is widely used in many areas, especially advanced analytics and AI. To my surprise, Python was a lot of fun to learn. I did more research into the uses of Python and found it to be a very useful tool in my toolkit. I can even design and program holiday lights for my house using Python!
Apply Math and Statistics to Data Analysis
Whoa, wait… math? Math is involved??? Yes, absolutely! Remember back in school when you thought “When will I EVER use this again in real life?” The answer is “Now, and always, honestly.” There are three classes offered here so you can choose which you want to learn:
I chose the Python Edition to keep building on the Python I learned in the previous class. I was not a great math student, so I was really afraid I would not be smart enough to get through this class. If you are feeling that way, stop that now. Like I have said before, these classes are designed so well that not only was I able to learn and grow, I earned a great grade! Don’t let fear of failure keep you from trying something new.
To be honest, I faced this particular class with dread. Much to my surprise, I really and truly enjoyed learning about building machine learning models. You can choose between Principles of Machine Learning: R Edition and Principles of Machine Learning: Python Edition. If you have previously chosen Python as I did, continue on with that path. This class offers a clear explanation of machine learning theory through hands-on experience in the labs. You will use Python or R to build, validate, and deploy machine learning models using Azure Notebooks.
I will make one suggestion, though: before taking this class, I would recommend completing Developing Big Data Solutions with Azure Machine Learning. As a more visual-based person, I found that I understood the machine learning models much better after completing the course using Azure Machine Learning.
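For anyone wondering what "build and validate a model" means in practice, here is a rough sketch of the idea. It is not from the course labs: the data is synthetic, the function names are my own, and it uses only the Python standard library. It fits a straight line on a training set, then checks it against held-out data and compares it to a naive baseline that always guesses the average.

```python
import random


def fit_line(xs, ys):
    """Fit y = slope * x + intercept by ordinary least squares."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def mean_absolute_error(actual, predicted):
    """Average absolute difference between actual and predicted values."""
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)


# Synthetic data (an assumption for illustration): y is roughly 3x + 5
# with some random noise added.
rng = random.Random(42)
xs = [float(i) for i in range(40)]
ys = [3.0 * x + 5.0 + rng.uniform(-2.0, 2.0) for x in xs]

# Shuffle the row indices, then hold out 25% of the rows for validation.
indices = list(range(len(xs)))
rng.shuffle(indices)
split = int(0.75 * len(indices))
train_idx, test_idx = indices[:split], indices[split:]

train_x = [xs[i] for i in train_idx]
train_y = [ys[i] for i in train_idx]
test_x = [xs[i] for i in test_idx]
test_y = [ys[i] for i in test_idx]

# "Build": fit the model using the training rows only.
slope, intercept = fit_line(train_x, train_y)

# Naive baseline: always predict the average of the training targets.
baseline = sum(train_y) / len(train_y)

# "Validate": score both on rows the model never saw during fitting.
model_mae = mean_absolute_error(test_y, [slope * x + intercept for x in test_x])
baseline_mae = mean_absolute_error(test_y, [baseline] * len(test_y))

print("model MAE:", model_mae)
print("baseline MAE:", baseline_mae)
```

The baseline comparison is the part worth keeping in mind: an error number means little on its own, but beating a simple guess on unseen data is a sign the model has learned something real.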
Build Predictive Solutions at Scale
Okay, now we are getting to some really fun stuff! I think this was my absolute favorite of all the classes. You can choose one of these three:
I chose Developing Big Data Solutions with Azure Machine Learning (AML) and what a blast I had! I can say that working with AML and with Azure Data Studio was like opening up presents on my birthday! The final projects were a lot of work, but I got a real sense of what working in data science and machine learning is all about… trial and error. It was a lot of fun trying to use insights, hunches, best guesses, and technology all together to create and train a model that makes accurate predictions!
After all the courses are completed and passed, you can only gain the MPP in Data Science if you successfully pass the Microsoft Professional Capstone: Data Science. As of the writing of this blog, I am slated to begin the Capstone on December 31, 2018, and I cannot think of a better way to ring in the new year!
I have researched many ways to become a Data Scientist. Many universities offer degrees in data science. I have found that the majority of their sites tout a Master’s or PhD in Data Science (with a heavy prerequisite of extensive math and stats classes) as what you need in order to become a data scientist. Must you have an advanced degree in mathematics or engineering to become a data scientist? Absolutely not. You don’t even have to hold a degree to work as a data scientist! Take a look at this article published on Forbes: 4 Reasons Not To Get That Masters In Data Science
My advice is to take a look at the Microsoft MPP program and try out a few of the free classes. If you are truly interested in a data science career and are willing to put forth the time and attention needed to learn, you already qualify as a good candidate. Don’t let your past dictate your future. Make the investment in yourself and grow along with the technology as it comes. You can do this!