"Whenever there's a buzzword and a 'complex' subject matter, it's usually good to start with the definition..."
What is Data Science?
The term Data Science, which was introduced around the 1980s, remains confusing for many people even now.
"Data Science is not about making complicated models; it's not about making awesome visualizations; it's not about writing code..." - Joma Tech
Then what is it?
# It's up to you
you_want = True
if you_want:
    print("You can")
else:
    print("You can't")
History of Data Science
In the early days, before Data Science became the "sexiest job of the 21st century", the popular term was Data Mining. In their 1996 article 'From Data Mining to Knowledge Discovery in Databases', Usama Fayyad, Gregory Piatetsky-Shapiro and Padhraic Smyth described Knowledge Discovery in Databases as the overall process of discovering useful knowledge from data, with Data Mining as the step that applies specific algorithms to extract patterns from it.
In 2001, William S. Cleveland proposed combining Computer Science with Data Mining by making statistics a lot more technical, believing that this would expand the possibilities of Data Mining and produce a powerful force for innovation.
Nowadays, taking advantage of computing power for statistics is what we know as Data Science.
Data Science is an interdisciplinary field whose true foundations lie in Statistics, Mathematics, Computer Science, and business too.
"I wanna become a Data Scientist, where should I start?..."
Many would say statistics, others math, others programming...
Best answer? Let's find out.
Assuming Data Science is an Ocean...
Step 1 - Sail around the Sea of Probability & Statistics 📈
Someone once said, "A Data Scientist is just an overpaid Statistician". How true is that? Statistics is the science concerned with developing and studying methods for collecting, analyzing, interpreting and presenting empirical data. In both Data Science and Statistics, our main concern is data: how we collect it, how we manipulate it (Data Wrangling), and how we get insights from it.
Hence, once equipped with the tools of Probability and Statistics, you will find it much easier to work with data, find correlations in it, and get useful insights from it.
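To make "finding correlations" a bit more concrete, here is a minimal sketch of the Pearson correlation coefficient written with nothing but Python's standard library. The function name `pearson_correlation` and the tiny dataset (hours studied vs. exam score) are made up purely for illustration; in practice you would probably reach for a library rather than write this by hand.

```python
from math import sqrt
from statistics import mean


def pearson_correlation(xs, ys):
    """Return the Pearson correlation coefficient of two equal-length lists."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two lists of the same length with at least 2 values")
    mean_x, mean_y = mean(xs), mean(ys)
    # How the two variables move together around their means.
    co_movement = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    # How much each variable spreads around its own mean.
    spread_x = sum((x - mean_x) ** 2 for x in xs)
    spread_y = sum((y - mean_y) ** 2 for y in ys)
    if spread_x == 0 or spread_y == 0:
        raise ValueError("correlation is undefined when a variable is constant")
    return co_movement / sqrt(spread_x * spread_y)


# Hypothetical, made-up data: hours studied vs. exam score.
hours_studied = [1, 2, 3, 4, 5]
exam_scores = [52, 58, 65, 70, 78]

r = pearson_correlation(hours_studied, exam_scores)
print(f"correlation = {r:.3f}")
```

A value close to 1 means the two columns rise together, a value close to -1 means one falls as the other rises, and a value near 0 means there is no clear linear relationship.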
There are awesome free resources and platforms online for learning Probability and Statistics, including Probability and Statistics: To p or not to p? on Coursera, Statistical Learning on edX, and an awesome book by Gareth James, Daniela Witten, Trevor Hastie and Rob Tibshirani, An Introduction to Statistical Learning.
Step 2 - Drift to the Bay of Computer Science 🖥️
There are two main programming languages for Data Science: Python and R. Python is the more popular of the two and highly flexible, which makes it easier to move into other domains like Web Development, Web Scraping, Embedded Systems, etc.
There are several free online resources to start your Python Programming journey, including the most popular Python Programming tutorial on YouTube.
Platforms like DataCamp and Coursera also have awesome courses to get you started in programming.
Ready to venture into deeper and stormier waters 🌊?
Step 3 - Machine Learning and Deep Learning
Recently, I published an article on Understanding Machine Learning and its Basic Workflow, where I go deeper into what Machine Learning is and what its complete workflow looks like in the real world.
In Machine Learning, you'll learn how to build models and use them for prediction, as well as the math behind what happens under the hood.
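As a small taste of what "build a model, predict, and peek under the hood" can look like, here is a minimal sketch of simple linear regression fitted by ordinary least squares, again using only the standard library. The names `fit_line` and `predict` and the toy numbers are assumptions made up for this example; real projects usually rely on a Machine Learning library instead.

```python
from statistics import mean


def fit_line(xs, ys):
    """Fit y = slope * x + intercept by ordinary least squares."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two lists of the same length with at least 2 values")
    mean_x, mean_y = mean(xs), mean(ys)
    spread_x = sum((x - mean_x) ** 2 for x in xs)
    if spread_x == 0:
        raise ValueError("cannot fit a line when every x is the same")
    # The math under the hood: slope = co-movement of x and y / spread of x.
    slope = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) / spread_x
    # The fitted line always passes through the point (mean_x, mean_y).
    intercept = mean_y - slope * mean_x
    return slope, intercept


def predict(model, x):
    """Use a fitted (slope, intercept) pair to predict y for a new x."""
    slope, intercept = model
    return slope * x + intercept


# Hypothetical, made-up training data that follows y = 2x + 1 exactly.
inputs = [1, 2, 3, 4]
targets = [3, 5, 7, 9]

model = fit_line(inputs, targets)
print("slope, intercept:", model)
print("prediction for x = 5:", predict(model, 5))
```

The "training" step is `fit_line`, which learns two numbers from the data, and the "prediction" step is `predict`, which applies them to an input the model has never seen.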
For more on Machine Learning and Deep Learning, I find Daniel Bourke's blog rich in useful information, including his Machine Learning Roadmap video on YouTube.
Wooow!! We've just discussed everything in a nutshell.
Where can I find the rest?...
>Google has it all in store for you!!
In conclusion, 365 Data Science offers a free guide, "Starting A Career In Data Science: Ultimate Guide", to help you understand more concepts in the field of 'Data', including roles such as Data Engineer and Data Analyst, which it explains in depth.