• DBT – Part 2: Tables and Views – Materialisation

    In this blog post, we explore how dbt helps analysts transform raw data into analytical insights that support quick business decisions. The key to this process is dbt materialisation. In simpler terms, dbt materialisation involves constructing objects by combining Views and Common Table Expressions (CTEs)…

  • DBT – Part 1: Overview and Environment Setup

    What is dbt? ‘dbt’ or ‘Data Build Tool’ is an open-source platform designed to enable users to articulate data outcomes as models and facilitate data transformation through optimized SQL queries. It is gaining prominence as a robust and efficient tool in the data engineering process. Leveraging dbt allows for the following capabilities: dbt environments dbt…

  • Prompt Engineering – Tips and Tricks

    Understanding Prompt Engineering? Recent advancements in the field of AI have highlighted the importance of prompt engineering. A key success of Generative AI is that it has made prompt engineering accessible to everyone, regardless of technical expertise. Prompt engineering is more than just typing text and waiting for output; it’s an art that involves combining…

  • Cloud services have become an inevitable topic in the roadmap of organizations. In the past few years, many companies have migrated their platforms from on-premises to cloud services. As cloud services are becoming more popular, we can also see many cloud services being released in the market, and there exists healthy competition between cloud providers.…

  • In this post, I discuss a few good practices for managing roles, privileges, and users in a Postgres database. In many cases where databases are administered by non-database professionals, users are created freely and privileges are granted without any guidelines. At some point, when the number of users increases it…

  • As many of us know, Pandas is one of the most used libraries in Python programming and is widely used for data analytics, data processing, and machine learning data preparation. Pandas creates structured tabular data objects with rows and columns. Pandas supports the following: Read data from multiple sources like CSV, JSON, SQL; Supports heterogeneous…

  • For many, understanding the real functionality and terminology of the Data Lakehouse has been a long discussion. For those who have worked on Data Warehouse and Data Lake, the Lakehouse would not be difficult to understand and implement. In simple terms, a Lakehouse is a hybrid approach that combines Data Warehouse and Data Lake.…

  • AWS Glue – Quick Start

    About Glue: AWS Glue is a fully managed serverless ETL service provided by AWS. Glue combines Apache Spark with a Hive Metastore-compatible data catalog. In my terminology, we can call Glue “D3”: Discover, Develop, Deploy. DISCOVER: Automatically discover and categorize your data, making it immediately searchable and queryable across data sources. DEVELOP: Generate…

  • Amazon Rekognition (ARS) is the image analysis service provided by AWS. With ARS we can perform the following functions: object and scene detection, image moderation, facial analysis, celebrity recognition, face comparison, and text in image. On this page, I wish to share the experience and fun I had with the Amazon Rekognition face comparison functionality. For this…
