Gilad David Maayan

Effective Code Documentation for Data Science Projects

Gilad David Maayan | February 26, 2024 | February 25, 2024

Code documentation is a detailed explanation of how the code works. It is a comprehensive guide that helps developers understand and use the code effectively. It is like a manual for your source code, providing information on the purpose of the code, how it is structured, and how it can be modified. Many developers might […]

Machine Learning Techniques for Application Mapping

Gilad David Maayan | January 22, 2024 | January 21, 2024

Application mapping, also known as application topology mapping, is a process that involves identifying and documenting the functional relationships between software applications within an organization. It provides a detailed view of how different applications interact, depend on each other, and contribute to the business processes. The concept of application mapping is not new, but its […]

Building Data Pipelines with Kubernetes

Gilad David Maayan | December 6, 2023 | December 5, 2023

A data pipeline is a set of processes that moves data from one place to another, typically from a data source to a storage system. These processes involve data extraction from various sources, transformation to fit business or technical needs, and loading into a final destination for analysis or reporting. The goal is to automate […]

5 FinOps Best Practices You Should Not Ignore

Gilad David Maayan | November 6, 2023 | November 5, 2023

FinOps, or Financial Operations, is a relatively new term that has been gaining traction in the business world. It represents a cultural shift in the way organizations manage their finances, especially in the context of cloud computing. FinOps is a collaborative approach that brings together finance, operations, and engineering teams to manage and control cloud […]

Managing a Freelance Data Science Team

Gilad David Maayan | October 13, 2023 | October 12, 2023

The freelance economy is booming and reshaping how work gets done. As a result, freelance management, which covers sourcing, coordinating, and retaining independent talent in a strategic manner, is becoming more prominent. This article focuses on how to manage a freelance data science team, a trend […]

What Is Metaflow? Quick Tutorial and Overview

Gilad David Maayan | September 22, 2023 | September 21, 2023

As data science continues to evolve, new tools and technologies are being developed to help individuals and organizations streamline their workflows, improve efficiency, and drive better results. One of the most powerful and innovative tools in this space is Metaflow, a Python library that makes it easy to build and manage data science workflows. In […]

Managing Data Costs on Azure

Gilad David Maayan | August 18, 2023 | August 17, 2023

As more businesses migrate their operations and data to the cloud, managing costs becomes an increasingly pertinent concern. Microsoft Azure, being one of the most versatile and popular cloud platforms, offers a vast array of data services but also comes with its own set of costs. Proper management of these costs can help businesses leverage […]

What Is GitOps and How Can It Support Machine Learning Operations?

Gilad David Maayan | July 24, 2023 | July 21, 2023

GitOps is a way of implementing continuous delivery for cloud native applications. It is based on the idea of using Git as a single source of truth for declarative infrastructure and applications. In GitOps, the desired state of the infrastructure and applications is stored in version control, and an automated process is used to ensure […]

What Is a Feature Store in Machine Learning?

Gilad David Maayan | June 6, 2023 | June 5, 2023

A feature store is a centralized platform for managing and serving the features used in machine learning (ML) models. A feature is an individual measurable property or characteristic of data that is used as input to an ML model. In order to build effective ML models, it is critical to have high-quality, well-engineered features that […]

Spark vs. Flink: Key Differences and How to Choose

Gilad David Maayan | May 8, 2023 | May 7, 2023

Apache Spark is an open-source, distributed computing system that provides a fast and scalable framework for big data processing and analytics. The Spark architecture is designed to handle data processing tasks across large clusters of computers, offering fault tolerance, parallel processing, and in-memory data storage capabilities. Spark supports various programming languages, such as Python (via […]