Python is one of the tools we have been introduced to lately in our Data Analytics (DA) course. It is one of the most popular programming languages, used by developers and non-developers worldwide. It has been used in systems such as Netflix’s recommendation engine and in software for self-driving cars. It was created by Guido van Rossum, who started work on it in 1989, and version 1.0 was released in 1994. The name “Python” came to Guido while he was reading scripts from the BBC’s Monty Python’s Flying Circus. Python 3.9, one of the more recent versions, was released in 2020 with newly added features. Python is a general-purpose language that can be used in many areas, including software development, web development, machine learning and data science.
The interface of Python looks like something out of the Matrix movie; however, it is beginner-friendly and easy to use. Python’s versatility and beginner-friendliness, along with its large, supportive community, have made it one of the most-used programming languages today. Some key aspects of Python include:
- Readability: The rules that control the structure of the symbols and words of the programming language are easy to read and understand.
- Versatility: Python can be used to create a plethora of programs; it is not fixed to one particular task.
- Community support: Python has a vast active community of developers who contribute to open-source projects, and provide tutorials and hands-on practice exercises.
- Large standard library: Guess what? You do not have to code everything from scratch with Python. It comes with a standard library that has modules and packages for a wide range of tasks.
- Data science and machine learning: Python is widely used for analysing data and building machine learning models.
- Web development: It is widely used for building many web applications.
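To make the readability and standard library points a bit more concrete, here is a minimal sketch that uses only modules shipped with Python. The scores and the `summarize` function are made up for this illustration; they are not from our class material.

```python
from collections import Counter
from statistics import mean, median

# Hypothetical data: exam scores invented for this illustration.
scores = [72, 85, 90, 85, 64, 78]


def summarize(values):
    """Return a few summary statistics using only the standard library."""
    return {
        "mean": mean(values),
        "median": median(values),
        # most_common(1) returns a list with one (value, count) pair.
        "most_common": Counter(values).most_common(1)[0][0],
    }


summary = summarize(scores)
print(summary)
```

Nothing here had to be installed separately: `statistics` and `collections` both come with Python itself.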
Another topic that we have covered recently in our DA class is Probability & Uncertainty. In our last class we looked at what probability means and some of the different types of uncertainty we have in data analysis.
Probability is the numerical measure of the likelihood or chance that an event will occur. It is expressed as a number between 0 and 1, where 0 indicates impossibility, 1 indicates certainty, and the numbers in between represent the degree of likelihood of an outcome. In statistical experiments, probability describes how likely each possible outcome is; it does not determine which outcome actually occurs. Probability is also different from uncertainty. Uncertainty has to do with the lack of complete knowledge about certain aspects of the data involved; it refers to epistemic situations with imperfect or unknown information (Wikipedia). We learnt about 3 types of uncertainty, namely:
- Aleatoric uncertainty: this refers to probabilistic uncertainty, the randomness in a system that cannot be eliminated even with complete information. An example of aleatoric uncertainty is weather forecasting.
- Epistemic uncertainty: this is associated with a lack of knowledge. Here you are not sure whether your expertise or theory will work, but the uncertainty can in principle be reduced by gathering more information.
- Knightian uncertainty: named after economist Frank H. Knight, this is radical uncertainty that goes beyond the traditional concepts of risk and probability, because the likelihood of the outcomes cannot be measured at all.
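To get a feel for probability as a number between 0 and 1, here is a small simulation sketch in Python. It rolls a fair six-sided die many times and estimates the probability of rolling a six. The function name `estimate_probability_of_six` and the fixed seed are my own assumptions for illustration, not something from the course.

```python
import random


def estimate_probability_of_six(num_rolls, seed=0):
    """Estimate P(six) for a fair die by simulating num_rolls rolls."""
    rng = random.Random(seed)  # fixed seed (an assumption) so runs are repeatable
    sixes = sum(1 for _ in range(num_rolls) if rng.randint(1, 6) == 6)
    return sixes / num_rolls


for n in (10, 1000, 100000):
    estimate = estimate_probability_of_six(n)
    print(f"{n:>6} rolls: estimated P(six) = {estimate:.3f} (true value {1/6:.3f})")
```

One way to read the result, loosely: the gap between the estimate and 1/6 behaves like epistemic uncertainty, since it tends to shrink as we collect more rolls, while the outcome of any single roll stays random no matter how much data we have, which is closer to aleatoric uncertainty.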
I find it interesting that we moved on to quite a number of these philosophical concepts in our Data Analytics class. I did not expect it, but cheers to more new learnings. I hear that another data analytics tool called Anaconda is around the corner. I do wonder why they like to use the names of snakes for these applications, but I guess that would be a blog post for another day.
#MMBA5