Working with Parquet files in Java using Carpet
After some time working with Parquet files in Java using the Parquet Avro library, and studying how it worked, I concluded that desp...
Learning Spark 2.0 Knowledge Dump
This post will serve as a continuous knowledge dump regarding the 'Learning Spark 2.0' book, where I'll dump certain quotes that I find relevant (...
Top 10 Common Data Engineers and Scientists Pain Points in 2024
As we navigate through 2024, the landscape of data engineering and science continues to evolve at a breakneck pace. ...
PySpark & Apache Spark – Overview
PySpark is the Python API for Apache Spark. It enables us to perform real-time, large-scale data processing in a distributed environment using Python. ...
Working with Parquet files in Java
Parquet is a widely used format in the Data Engineering realm and holds significant potential for traditional Backend applications. This article ...
From Class to Abstract Classes
Part of the series From Bootstrap to Airflow DAG (11 Part Series): 1. Web Scraping Sprott U Fund with BS4 in 10 Lines of Code; 2. The Web Scraping Continuum ... 7 more par...
How I Decreased ETL Cost by Leveraging the Apache Arrow Ecosystem
In the field of Data Engineering, the Apache Spark framework is one of the best-known and most powerful ways to extrac...
Integrating a Web API with Datastore Emulator
The high billing cost associated with Google Cloud Platform (GCP) projects is something we should always keep in mind during ...
Introduction to Python for Data Engineering
Yes hello! With increasing interest in data engineering expertise among organizations, we have seen a rise in the demand for data engin...
Data Engineering 102: Introduction to Python for Data Engineering.
Greetings to my dear readers, today we will be covering Python for Data Engineering. If you read my article...
ETL Process – The ABC of the DATA Engineer
ETL, the process of extracting, transforming and loading data, also called streaming data process, is the foundation of data engineering...
You may not need Airflow…. yet
TL;DR: Airflow is robust and flexible, but complicated. If you are just starting to schedule data tasks, you may want to try more tailored solutions...