Introduction to Machine Learning: Breaking Down the Basics

In 2023, Collins Dictionary named “AI” its word of the year. According to the publishers, use of the term has quadrupled. 2023 may well be remembered as the year that ushered in a new era of digital technology.

Wherever we turn, the presence of AI is evident in our daily lives – whether it’s in the creation of personal photos, video dubbing, the latest versions of company chatbots, or even in the new Beatles song playing on radio and music streaming platforms. This leads us to a question posed long ago by the mathematician and computer scientist, Alan Turing:

Can machines think?

Turing posed this question in his 1950 article, as part of a thought experiment famously dubbed the imitation game. In this game, a human judge engages with both a machine and a human without knowing which is which. If the judge cannot reliably distinguish between them based on their responses, the machine is deemed to have passed the Turing Test, showcasing a degree of artificial intelligence. The objective is to evaluate a machine’s capability for human-like conversation and behaviour.

This test can be seen as one origin of what we now recognize as machine learning. The prospect of encoding thought in computers, akin to that of living beings, marked a significant milestone for humanity. Today the concept is applied in diverse areas, and on certain tasks machines outperform humans.

Decoding the Jargon

Here is my selection of terms that often cause confusion:

  1. Artificial Intelligence (AI): The expansive field aiming to develop intelligent machines capable of emulating human cognition.
  2. Machine Learning (ML): A branch of AI that concentrates on algorithms and statistical models, empowering systems to discern patterns and make decisions without explicit programming.
  3. Deep Learning: A specialized variant of machine learning that utilizes neural networks with multiple layers to extract high-level features from data.
  4. Statistical Learning: The broader concept encompassing machine learning, emphasizing the utilization of statistical methods to formulate predictions or decisions.

Machine Learning

Tom Mitchell once stated, “A computer program is said to learn from experience E with respect to some class of tasks T and performance measure P, if its performance at tasks in T, as measured by P, improves with experience E.” This might sound complex, but let’s simplify it.

Imagine creating a program to predict the accumulated precipitation in the next hour based on past data. The task (T) is to estimate the precipitation accumulation for the upcoming hour. The performance measure (P) is some error metric, such as the difference between the predicted and observed values. The experience (E) consists of the repeated attempts to make the forecast. The program learns as its predictions approach the observed values over these experiences. How it learns is governed by a set of configurations chosen in advance, known as hyperparameters.
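To make T, P, and E concrete, here is a minimal sketch in plain Python. The rainfall numbers, the one-parameter model, and the learning rate are all made up for illustration; a real forecast would use far richer data and models.

```python
# Toy data (made up): rainfall in the current hour and in the next hour, in mm.
current_hour = [0.0, 1.0, 2.0, 3.0, 4.0]
next_hour = [0.0, 0.8, 1.6, 2.4, 3.2]


def mean_absolute_error(weight):
    """Performance measure P: average gap between predicted and observed values."""
    gaps = [abs(weight * x - y) for x, y in zip(current_hour, next_hour)]
    return sum(gaps) / len(gaps)


weight = 0.0          # the model for task T: prediction = weight * current_hour
learning_rate = 0.05  # a hyperparameter, fixed before learning starts
error_history = [mean_absolute_error(weight)]

for attempt in range(10):  # experience E: repeated attempts at the forecast
    # Gradient of the mean squared error with respect to the weight.
    gradient = sum(
        2 * (weight * x - y) * x for x, y in zip(current_hour, next_hour)
    ) / len(current_hour)
    weight -= learning_rate * gradient
    error_history.append(mean_absolute_error(weight))

print(f"learned weight: {weight:.3f}")
print(f"first error: {error_history[0]:.3f}, last error: {error_history[-1]:.5f}")
```

In this sketch the error shrinks with every attempt, which is exactly what Mitchell’s definition calls learning.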

Types of Machine Learning

In general, there are three types of machine learning:

Supervised Learning

In this paradigm, the model is provided with a dataset and already knows what the correct output should resemble; in other words, each given example has an associated label or target. A model based on supervised learning endeavours to identify the mapping from input to output, allowing it to offer precise predictions when presented with new, unseen data. This is particularly applicable in image recognition, speech recognition, and spam filtering scenarios.

Supervised learning problems fall into two categories: regression and classification. In regression tasks, the model fits a function that maps the input data to a continuous output value. In classification tasks, the model fits a function that assigns each input to one of a set of discrete categories.

Let’s consider a scenario where a botanist collects measurements associated with iris flowers, including the length and width of the petals and the length and width of the sepals, all measured in centimeters. These iris flowers have been previously identified by an expert botanist as belonging to the species setosa, versicolor, or virginica. If we want to build a machine learning model that can learn from the measurements of these irises, whose species is known, so that we can predict the species for a new iris, we are dealing with a classification problem. This is because we aim to categorize new irises based on a labeled dataset.
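As a rough sketch of this classification problem, the code below uses the copy of the iris dataset that ships with scikit-learn. The choice of a k-nearest neighbors classifier, the split settings, and the measurements of the new flower are our own assumptions, picked only to keep the example short.

```python
import numpy as np
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier

iris = load_iris()

# Hold back part of the labeled data to check the model on unseen flowers.
X_train, X_test, y_train, y_test = train_test_split(
    iris.data, iris.target, random_state=0
)

classifier = KNeighborsClassifier(n_neighbors=3)
classifier.fit(X_train, y_train)
accuracy = classifier.score(X_test, y_test)
print(f"Accuracy on held-out irises: {accuracy:.2f}")

# A new iris (hypothetical values, in cm):
# sepal length, sepal width, petal length, petal width.
new_iris = np.array([[5.0, 2.9, 1.0, 0.2]])
predicted_species = iris.target_names[classifier.predict(new_iris)[0]]
print(f"Predicted species: {predicted_species}")
```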

Now, imagine that we want to create an algorithm that predicts the price of a house based on its size and location in the real estate market.  Price as a function of size and location is a continuous output, so this is a regression problem.
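Here is a minimal regression sketch with scikit-learn. The data is invented: we describe location by the distance to the city centre (an assumption, since location could be encoded in many ways) and generate prices from a simple linear rule so the result is easy to check.

```python
import numpy as np
from sklearn.linear_model import LinearRegression

# Hypothetical houses: [size in square metres, distance to city centre in km].
features = np.array(
    [[50, 10], [80, 5], [120, 2], [65, 8], [100, 12], [150, 1]], dtype=float
)

# Made-up prices that follow a known linear rule.
prices = 50_000 + 1_000 * features[:, 0] - 2_000 * features[:, 1]

model = LinearRegression()
model.fit(features, prices)

# Predict the price of a house the model has not seen.
new_house = np.array([[90.0, 4.0]])
predicted_price = model.predict(new_house)[0]
print(f"Predicted price: {predicted_price:,.0f}")
```

Because the output is a number on a continuous scale rather than a category, the model is a regressor, not a classifier.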

Unsupervised Learning

In contrast, unsupervised learning tackles problems where we have little or no prior knowledge of what the results should look like, using unlabeled data.

So, imagine you have a basket of various fruits, but you don’t know which fruits belong to which category. Through unsupervised learning, the algorithm might group the fruits based on similarities in features like shape, color, and size. Without any prior knowledge of specific fruit names, the algorithm autonomously identifies clusters, revealing, for instance, that the apples resemble one another more than they resemble the oranges or the bananas.
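The fruit basket can be sketched with k-means clustering from scikit-learn. The measurements and the 0 to 10 color score are invented for illustration, and k-means is just one of several clustering algorithms that could be used here.

```python
import numpy as np
from sklearn.cluster import KMeans

# Hypothetical fruits: [length in cm, width in cm, color score from 0 to 10].
# No fruit names are given to the algorithm.
fruits = np.array([
    [7.0, 7.2, 9.0], [7.5, 7.4, 8.5], [6.8, 7.0, 9.2],     # look like apples
    [7.4, 7.5, 5.0], [7.8, 7.6, 5.5], [7.1, 7.2, 4.8],     # look like oranges
    [18.0, 3.5, 2.0], [19.5, 3.8, 2.3], [17.2, 3.4, 1.8],  # look like bananas
])

# Ask for three groups; the number of clusters is something we choose up front.
kmeans = KMeans(n_clusters=3, n_init=10, random_state=0)
labels = kmeans.fit_predict(fruits)

# Each fruit gets a cluster number; the numbers themselves carry no meaning.
print(labels)
```

The algorithm only returns group numbers. Deciding that one group is “apples” is still up to a person looking at the clusters.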

Reinforcement Learning

This subset of machine learning enables an AI agent to acquire knowledge through experimentation and feedback on its actions. The feedback can be positive or negative, and the agent’s goal is to maximize its cumulative reward.

In a certain sense, we can say that RL shares similarities with supervised learning when it involves mapping between input and output. However, in RL, the agent autonomously decides what actions to take to accomplish a task correctly. 

This approach finds significant application in games like chess, where an agent refines its strategy based on accumulated experiences over time. Consider another example: suppose we want to develop an algorithm that guides a robot to explore and clean a room. It receives positive reinforcement when it successfully cleans a dirty area and experiences negative reinforcement when encountering obstacles or failing to clean certain areas.  Through this feedback loop, the robotic vacuum learns to navigate efficiently, avoiding obstacles and optimizing its cleaning strategy over time.
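The cleaning robot can be reduced to a toy problem and solved with tabular Q-learning, one common reinforcement learning algorithm. Everything below is a simplified assumption: the room is a corridor of five tiles, the only obstacle is the wall on the left, and the reward values and learning settings are made up.

```python
import random

N_TILES = 5      # tiles 0..4 in a corridor; the robot starts on tile 0
DIRTY_TILE = 4   # the dirty tile sits at the far end
LEFT, RIGHT = 0, 1


def step(tile, action):
    """Return (next_tile, reward, done) for one move of the hypothetical robot."""
    if action == LEFT and tile == 0:
        return tile, -1.0, False      # bumped into the wall: negative feedback
    next_tile = tile - 1 if action == LEFT else tile + 1
    if next_tile == DIRTY_TILE:
        return next_tile, 1.0, True   # cleaned the dirty tile: positive feedback
    return next_tile, 0.0, False


def train(episodes=500, alpha=0.5, gamma=0.9, epsilon=0.1, seed=0):
    rng = random.Random(seed)
    q = [[0.0, 0.0] for _ in range(N_TILES)]  # q[tile][action]
    for _ in range(episodes):
        tile, done = 0, False
        while not done:
            # Explore at random sometimes, and whenever both actions look equal.
            if rng.random() < epsilon or q[tile][LEFT] == q[tile][RIGHT]:
                action = rng.choice([LEFT, RIGHT])
            else:
                action = LEFT if q[tile][LEFT] > q[tile][RIGHT] else RIGHT
            next_tile, reward, done = step(tile, action)
            # Q-learning update: move the estimate toward reward + discounted future value.
            target = reward if done else reward + gamma * max(q[next_tile])
            q[tile][action] += alpha * (target - q[tile][action])
            tile = next_tile
    return q


q_table = train()
policy = ["left" if q[LEFT] > q[RIGHT] else "right" for q in q_table[:DIRTY_TILE]]
print(policy)
```

After training, the learned policy heads straight for the dirty tile and avoids the wall, even though nobody told the robot which way to go.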

Conclusion

In conclusion, delving into the realm of AI is akin to embarking on a journey of continual adjustments and twists. Changes don’t happen in the blink of an eye; they’re more like a slow burn. Yet, many individuals overlook these shifts. The trick? It’s all about hitting the books, maintaining a vigilant eye on the everyday grind, and giving things thoughtful consideration. These skills aren’t just useful; they’re the secret sauce for staying on the AI adaptation rollercoaster. No quick fixes here; it’s an ongoing commitment. So, let’s keep our learning hats on, stay curious, and ride the waves of AI’s ever-evolving journey!

Warning: This article was written with AI help 😉

Joyce Araujo

Sr. Software Engineer

References:

Mitchell, Tom M. 1997. Machine Learning. First. McGraw-Hill Science/Engineering/Math.

Turing, Alan M. 1950. Computing Machinery and Intelligence. Mind, LIX(236), 433-460. https://academic.oup.com/mind/article/LIX/236/433/986238

BBC News, AI named word of the year by Collins Dictionary https://www.bbc.com/news/entertainment-arts-67271252

Andreas C. Müller & Sarah Guido. Introduction to Machine Learning with Python: A Guide for Data Scientists.

York University, what is reinforcement learning https://online.york.ac.uk/what-is-reinforcement-learning/ 

Unsupervised learning image

https://nixustechnologies.com/unsupervised-machine-learning/

The 10 Most In-Demand AI Professions in Colombia

Bogotá (Colombia), February 2024. The exciting universe of artificial intelligence (AI) unfolds every day as a stage full of extraordinary professional opportunities. From automation to advanced decision-making, intelligent technologies are completely reshaping the way we live and work globally.

Colombia is part of this revolution, offering multiple job possibilities in the development, maintenance, and enhancement of such technologies. These jobs are not only on the rise but also redefine the traditional notion of work, creating a new work paradigm.

It’s worth noting that the scope of artificial neural networks goes beyond labor transformation; it serves as an inspiration for young people to dive into STEM careers. Specifically, the motivation to enroll in software-related programs has been experiencing steady growth for the past couple of years, driven by the allure of actively participating in the creation and evolution of innovative technologies. Similarly, the opportunity to contribute to change and be at the forefront of the tech revolution motivates these enthusiasts to explore a range of educational alternatives linked to science, technology, engineering, and mathematics.

Now, job openings in the artificial intelligence field not only offer competitive salaries but also provide unparalleled benefits. From fostering mental well-being and maintaining a balance between personal and professional life to constant immersion in multicultural teams at large companies, the professional journey in this field unfolds as a fascinating path to success.

Within this environment, certain careers emerge as the most sought after in the artificial intelligence field. The demand is booming, and the opportunities are endless for those with experience as:

  1. Full Stack Developer
  2. DevOps Engineer
  3. Data Scientist
  4. Cybersecurity Expert
  5. Mobile Application Developer
  6. Machine Learning Engineer
  7. Robotic Process Automation (RPA) Specialist
  8. Data Analyst
  9. Network Engineer
  10. Systems Architect

According to Diego Gamboa, Chief Technology Officer at the software consultancy firm Mismo, which currently has relevant job openings, “artificial intelligence reshapes the business landscape and initiates a significant transformation in the very essence of professionals by automating tasks, enhancing strategic decision-making, and opening new frontiers of innovation. This marks the beginning of an era where collaboration between humans and machines redefines the very concept of efficiency and business excellence.”

“This revolution is not simply about adapting to a new business environment; it propels a fundamental shift in skills, perspectives, and roles of individuals in the workforce, ushering in a new era of adaptability and vision in the evolution of careers,” explains the executive.

Those interested in applying for these opportunities should have a minimum English level of B2 according to the Common European Framework of Reference for Languages (CEFR), demonstrate a motivation for constant learning, and cultivate soft skills such as effective communication and creative problem-solving. Adaptability and ethical data management are also indispensable.

There are several platforms that offer possibilities for those seeking jobs in these areas, with Mismo Remote Jobs being one of the prominent ones. In this space, applicants not only have access to job vacancies but also find guidance and support in their international job search.

Practical Introduction to Data Science in Python

Data science has emerged as one of the fastest-growing and most exciting fields in the world of technology today. With the increasing amount of information generated across every aspect of our lives (from our cell phones, social media, online banking, etc.), data scientists have become critical to the success of businesses around the globe because they understand the underlying business problems and can translate them into actionable recommendations for decision makers. In the past, data analysis was a tedious and time-consuming process, but with the rise of advanced tools and techniques, data scientists can now quickly and accurately analyze and interpret data.

Python is one of the most widely used programming languages in data science, thanks to its user-friendly syntax and its extensive libraries, which make dataset manipulation, analysis, and visualization straightforward and efficient.

In this article, we’ll briefly introduce you to some of the essential tools for data science in Python, including Jupyter Notebooks, Pandas, Matplotlib, and scikit-learn. We’ll provide examples of usage for each library.

Jupyter Notebooks

Jupyter Notebooks are an essential tool for data scientists and Python programmers alike. They provide an interactive environment for writing and executing code, as well as visualizing and sharing data. For example, you can include Markdown text in your notebook to add notes and explanations alongside your code, and you can embed charts created with Matplotlib or other libraries.

Pandas

Pandas is a popular Python library for data manipulation and analysis. It offers data structures and functions that make both tasks easier.

One of the most important data structures in Pandas is the DataFrame. A DataFrame is a 2-dimensional labeled data structure with columns (like a table) of potentially different types. 
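A small DataFrame can be built from a dictionary of columns. The names, ages, and column labels below are made-up sample data, used only for illustration.

```python
import pandas as pd

# Hypothetical sample data: each dictionary key becomes a column.
people = pd.DataFrame({
    "Name": ["Ana", "Luis", "Marta", "Pedro"],
    "Age": [28, 35, 42, 31],
    "Gender": ["F", "M", "F", "M"],
})

print(people)
print(people.dtypes)  # each column can hold a different type
```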

Now, let’s say we have a DataFrame with Gender and Age columns, and we want to group the data by the Gender column and calculate the mean age for each group. We can use the groupby method to achieve this:
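A minimal sketch of that grouping, using made-up sample data with Age and Gender columns:

```python
import pandas as pd

# Hypothetical sample data.
people = pd.DataFrame({
    "Name": ["Ana", "Luis", "Marta", "Pedro"],
    "Age": [28, 35, 42, 31],
    "Gender": ["F", "M", "F", "M"],
})

# Split the rows by Gender, then average the Age column within each group.
mean_age_by_gender = people.groupby("Gender")["Age"].mean()
print(mean_age_by_gender)
```

The result is a Series indexed by the group labels, so a single group can be read with mean_age_by_gender["F"].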

In some cases, our data may contain missing values (NaN). We can drop these values using the dropna method:
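Here is a short sketch of dropna on made-up data with a few gaps. By default it removes every row that has at least one missing value; the subset parameter limits the check to chosen columns.

```python
import numpy as np
import pandas as pd

# Hypothetical sample data with some missing entries.
people = pd.DataFrame({
    "Name": ["Ana", "Luis", "Marta", "Pedro"],
    "Age": [28, np.nan, 42, 31],
    "Gender": ["F", "M", None, "M"],
})

# Keep only rows with no missing values at all.
complete_rows = people.dropna()

# Keep rows that have an Age, even if other columns are missing.
rows_with_age = people.dropna(subset=["Age"])

print(complete_rows)
print(rows_with_age)
```

Note that dropna returns a new DataFrame and leaves the original untouched.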

These are just a few examples of what you can do with Pandas. The library offers many more tools and methods for manipulating and analyzing data, including filtering, merging, and transforming data. 

Matplotlib

Matplotlib is a popular data visualization library for Python that provides a variety of tools for creating high-quality visualizations. With Matplotlib, you can create a wide range of charts, plots, and graphs, including scatter plots, line plots, bar charts, and more.

Some examples of different plots include:

  • Scatter Plot: A scatter plot is a great way to visualize the relationship between two variables.
  • Bar Chart: A bar chart is a great way to visualize categorical data.
  • Histogram: A histogram is a great way to visualize the distribution of a dataset.
  • Line Plot: A line plot is a great way to visualize the trend of a dataset.
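The four plot types above can be sketched in a single figure. All of the numbers are invented, and the function name make_example_plots is our own.

```python
import matplotlib.pyplot as plt


def make_example_plots():
    """Draw one example of each plot type on a 2x2 grid (made-up data)."""
    fig, axes = plt.subplots(2, 2, figsize=(10, 8))

    # Scatter plot: relationship between two variables.
    hours_studied = [1, 2, 3, 4, 5, 6]
    exam_scores = [52, 58, 65, 70, 78, 85]
    axes[0, 0].scatter(hours_studied, exam_scores)
    axes[0, 0].set_title("Scatter plot")

    # Bar chart: one value per category.
    axes[0, 1].bar(["A", "B", "C"], [12, 7, 15])
    axes[0, 1].set_title("Bar chart")

    # Histogram: distribution of a single variable.
    ages = [22, 25, 27, 29, 31, 31, 34, 35, 38, 41, 45, 52]
    axes[1, 0].hist(ages, bins=5)
    axes[1, 0].set_title("Histogram")

    # Line plot: trend over time.
    months = [1, 2, 3, 4, 5, 6]
    sales = [100, 120, 115, 140, 160, 155]
    axes[1, 1].plot(months, sales, marker="o")
    axes[1, 1].set_title("Line plot")

    fig.tight_layout()
    return fig, axes


fig, axes = make_example_plots()
```

In a Jupyter Notebook the figure is displayed inline; in a script, call plt.show() or fig.savefig with a file name of your choice.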

Scikit-learn

Scikit-learn is a powerful machine-learning library for Python that provides a wide range of tools for data mining, analysis, and modeling. It is built on top of other popular scientific Python libraries, including NumPy, SciPy, and Matplotlib, and provides an easy-to-use interface for building machine learning models.

Scikit-learn includes a variety of machine learning algorithms for regression, classification, clustering, and dimensionality reduction. It also provides tools for feature extraction and selection, data preprocessing, and model evaluation. With Scikit-learn, you can build and train machine learning models on your data, evaluate their performance, and use them to make predictions.
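The sketch below strings those steps together: preprocessing, training, evaluation, and prediction. It uses the wine dataset bundled with Scikit-learn; the choice of dataset, of logistic regression as the model, and of the split settings is ours, made only to keep the example small.

```python
from sklearn.datasets import load_wine
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

X, y = load_wine(return_X_y=True)

# Keep a quarter of the data aside to evaluate the trained model.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0, stratify=y
)

# Preprocessing and model chained together: scale the features, then classify.
model = make_pipeline(StandardScaler(), LogisticRegression(max_iter=1000))
model.fit(X_train, y_train)

accuracy = model.score(X_test, y_test)
print(f"Test accuracy: {accuracy:.2f}")

# Use the trained model to make predictions for a few samples.
predictions = model.predict(X_test[:5])
print(predictions)
```

Wrapping the scaler and the classifier in one pipeline means the same preprocessing is applied automatically at training and at prediction time.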

Conclusion

Python is a versatile programming language that offers a range of powerful tools for data science. We introduced you to some of the essential libraries and tools for data analysis, manipulation, and visualization in Python. By mastering these tools, you’ll be well on your way to becoming a proficient data scientist in Python. 

As technology evolves, we can expect to see more powerful and sophisticated algorithms that can analyze and interpret vast amounts of data. Additionally, we may see increased adoption of machine learning and AI technologies in various fields, such as healthcare, finance, and transportation, to name a few. With these advancements, we can expect data science to play an even more crucial role in decision-making processes, innovation, and problem-solving across industries.

Édgar Alexander Dávila

Software Engineer

References:

Altintas, I., Porter L. (2022). Python for Data Science [MOOC], UCSanDiegoX DSE200x [Online course]. edX.
https://learning.edx.org/course/course-v1:UCSanDiegoX+DSE200x+3T2022/home 


Silicon Valley offers remote job opportunities for Costa Ricans: Get to know the requirements

San Jose (Costa Rica), August 2023. Silicon Valley, known as the hub of technological innovation, has become an attractive place for Latin American talents seeking remote work. In an increasingly interconnected world, these leading technology companies provide opportunities for professionals from different nations, especially Costa Rican talents, who have proven to be creative and proactive, with a constant passion for learning.

Startups have shown a growing interest in hiring remote Costa Rican software engineers, recognizing their significant contribution to the technology field. Local talent is highly valued in the global tech industry due to their mental agility, problem-solving skills, educational background, and passion for their work. These companies embrace cultural diversity and the unique perspectives that talent from the region brings, actively seeking highly skilled professionals.

For Costa Ricans interested in applying for these opportunities, the requirements are demanding but attainable. Proficiency in English at a B2 level according to the Common European Framework of Reference (CEFR) is essential to communicate in international settings effectively. Additionally, knowledge and experience in web interface design, software development, and data analysis are highly valued skills in today’s technology field. Companies also seek professionals with soft skills like teamwork, adaptability, and resilience.

Discover the requirements for remote work from Costa Rica:

English Proficiency: Ensuring global communication. To make headway in the international professional landscape, a minimum B2 level of English proficiency according to the Common European Framework of Reference (CEFR) is a fundamental requirement. This competence ensures efficient communication in global contexts, preparing individuals to tackle worldwide challenges and cultivate relationships that transcend geographic boundaries.

Technical Mastery and Experience: Leading technological change. The tech field demands professionals with broad skills and experience. Those passionate about web interface design, software development, quality assurance, and data analysis have the opportunity to lead change in large-scale projects and create innovative solutions.

Soft Skills: The compass to collaborative success. Beyond technical talent, organizations seek professionals with outstanding soft skills. The ability to work in a team, express oneself effectively, and adapt to challenges in the workplace is essential to excel in the global business landscape. Candidates with a positive attitude and collaboration skills stand out in today’s business environment.

“If you aspire to join the dynamic tech world, it’s essential to dedicate time and effort to improve your technical skills, whether through specialized courses, immersion in relevant content, or constant practice in challenging projects,” says Diego Gamboa, Chief Technology Officer of the software consulting firm Mismo.

He further explains, “Silicon Valley companies appreciate candidates who maintain a strong online presence, sharing projects and contributions on platforms like GitHub and LinkedIn. This kind of engagement demonstrates their passion and commitment to technology and the professional community.”

Working for American companies from Costa Rica offers several attractive benefits for professionals. The primary one is receiving salaries in US dollars, strengthening purchasing power against local currencies and providing economic stability. These startups also offer budget allowances for home office setup, access to quality technology and internet connections, as well as budgets for professional development and continuous education. Additional benefits include flexible working hours, an innovative work environment, and the opportunity to work on global-scale projects.

The opportunities for Costa Rican talent in Silicon Valley are diverse and exciting. These include positions for full-stack engineers and front-end and back-end developers, which are essential for creating technological projects. Similarly, DevOps engineers, experts in infrastructure, are in demand. These engineers, along with data scientists, play a vital role in the product and service development life cycle.

If you are a passionate professional with outstanding technical skills and a proactive attitude, we invite you to explore the available opportunities and apply to become a part of this innovative and ever-evolving community. Those interested in applying for these positions can do so at Mismo.