Unleashing the Power of Data: Introducing Apache DevLake

In the dynamic landscape of software development, Apache DevLake offers an answer to the never-ending challenge of understanding and optimizing the development process. As an innovative open-source developer data platform, Apache DevLake is designed to transform how development teams harness data for engineering excellence, elevate the developer experience, and foster community growth.

Our customer faced a significant challenge in gaining comprehensive insights into their software development practices. Fragmented data from different DevOps tools, such as Jira, GitLab, and Jenkins, made it difficult to make informed decisions and improve the development process. This data puzzle created several problems:

  1. Decision-Making Dilemma: Critical decisions in software development require a comprehensive understanding of the entire development lifecycle. The absence of a unified view made decision-making difficult, with only partial insights into the consequences of each move.
  2. Pipeline Optimization: Optimizing the pipeline requires a thorough examination of the interconnected processes. Without a consolidated perspective, identifying bottlenecks, inefficiencies, and areas for improvement became a daunting task.
  3. DevOps Metrics and Performance: In the realm of DevOps, metrics such as deployment frequency, lead time for changes, and stability measures are the compass guiding teams toward success. Fragmented data sources hindered the customer’s ability to measure and improve these critical metrics effectively.

Recognizing the need for a transformative solution, the CloudifyOps team introduced Apache DevLake. Acting as a central hub, DevLake seamlessly ingested, analyzed, and visualized data from across the Software Development Life Cycle (SDLC).

How to Set Up DevLake?

  • To implement a proof of concept for Apache DevLake tailored to your specific use cases, you can install it on your local machine using Docker Compose (see the sketch after this list). The detailed instructions for this setup can be found in the Docker Compose setup documentation.
  • Alternatively, if your infrastructure is powered by Kubernetes, you can explore the Helm setup option. The Helm setup documentation provides guidance on deploying and configuring Apache DevLake using Helm.
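
For a quick local trial, the Docker Compose flow looks roughly like this. This is a minimal sketch: the release tag and asset names are assumptions, so follow the Docker Compose setup documentation for the exact steps for your version.

# Fetch the Compose file and sample env file from a DevLake release (asset names assumed)
curl -LO https://github.com/apache/incubator-devlake/releases/download/v0.19.0-beta6/docker-compose.yml
curl -Lo .env https://github.com/apache/incubator-devlake/releases/download/v0.19.0-beta6/env.example
# Generate and persist the encryption secret that protects tokens stored in the database
echo "ENCRYPTION_SECRET=$(openssl rand -base64 2000 | tr -dc 'A-Z' | fold -w 128 | head -n 1)" >> .env
# Start DevLake, Grafana, and the config UI in the background
docker-compose up -d

Once the containers are up, the config UI is typically available at http://localhost:4000.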

Installation via Helm

Prerequisites

  • Helm >= 3.6.0
  • Kubernetes >= 1.19.0

To install the chart with release name devlake, follow these steps:

Step 1: Secure Encryption Key

Start by generating a secure encryption key using a reliable method like OpenSSL. For instance, run the following command to create a 128-character string of uppercase letters:

openssl rand -base64 2000 | tr -dc 'A-Z' | fold -w 128 | head -n 1

Copy the generated string and set the value of the ENCRYPTION_SECRET environment variable by running the following command:

export ENCRYPTION_SECRET="copied string"

Remember, keeping the ENCRYPTION_SECRET safe is crucial, as it encrypts sensitive information such as personal access tokens and passwords in the database.

Step 2: Install the Chart

Install the chart by executing the following commands:

helm repo add devlake https://apache.github.io/incubator-devlake-helm-chart
helm repo update
helm install devlake devlake/devlake --version=0.19.0-beta6 --set lake.encryptionSecret.secret=$ENCRYPTION_SECRET

And just like that, the journey begins. Visit your DevLake instance at http://YOUR-NODE-IP:32001 (the default NodePort is 32001).
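
If you are unsure of your node IP or want to confirm the service is exposed, standard kubectl commands help. The service name filtered for below is a guess based on the chart's naming convention, so adjust it as needed.

# List nodes with their internal/external IPs (use one as YOUR-NODE-IP)
kubectl get nodes -o wide
# Confirm a devlake UI service is exposing NodePort 32001
kubectl get svc | grep devlake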

Configuration: Tailoring DevLake for Your Needs

Supported Data Sources:

DevLake supports a range of data sources to cater to diverse needs. Among the supported sources are:

  • Jira
  • Jenkins
  • GitLab
  • GitHub
  • Feishu
  • Bitbucket
  • Gitee
  • SonarQube
  • Bamboo
  • ZenTao
  • TAPD
  • Azure DevOps

Crafting a Client’s Success Story

Let’s dive into a real-world scenario where Apache DevLake became a pivotal solution for a customer project involving Jira, GitLab, and Jenkins.

Step 1 — Add Data Connections

Open the config UI and add the data sources.

  • Step 1.1 — Add a connection. Configure the endpoint and authentication details to connect to the source data (a scripted example follows this list).
  • Step 1.2 — Add data scopes, such as Git repositories, issue boards, or CI/CD pipelines, to determine what data should be collected. Select only the relevant boards. Here, we connect Jira to track issues, GitLab for the repositories, and Jenkins for the pipelines.
  • Step 1.3 — Add scope config (optional). Define the specific data entities within the data scope for collection or apply transformation rules to the raw API responses.
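
If you prefer to script this step, the config UI talks to DevLake’s REST API, and a connection can be created with a call along the following lines. Treat this as a sketch only: the port, path, and payload fields are assumptions based on the API behind the config UI, so verify them against the API docs for your DevLake version.

# Hypothetical example: create a Jira connection through the DevLake API (port 8080 assumed)
curl -X POST http://localhost:8080/plugins/jira/connections \
  -H 'Content-Type: application/json' \
  -d '{
        "name": "my-jira",
        "endpoint": "https://your-domain.atlassian.net/rest/",
        "username": "me@example.com",
        "password": "YOUR_API_TOKEN"
      }'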

Step 2 — Collect Data in a Project

  • Step 2.1 — Create a project. DevLake assesses DORA metrics at the project level (a scripted sketch follows this list).
  • Step 2.2 — Associate connection(s) with the project. When associating a connection with a project, you can select specific data scopes. All connections linked to the same project will be considered part of the same project for calculating DORA metrics.
  • Step 2.3 — Set the synchronization policy. Specify the sync frequency, time range, and the skip-on-fail option for your data.
  • Step 2.4 — Start data collection.
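
As with connections, project creation can be scripted against the REST API. The request below is hypothetical: the /projects path and the payload shape are assumptions, so check the API documentation before relying on it.

# Hypothetical example: create a DevLake project via the API
curl -X POST http://localhost:8080/projects \
  -H 'Content-Type: application/json' \
  -d '{"name": "my-project", "description": "Jira + GitLab + Jenkins"}'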

Step 3 — Check the Data in Grafana Dashboards

To view the collected data, click on the “Dashboards” button located in the top-right corner of Config UI.

  • After configuring your project in DevLake, you can access pre-built dashboards in Grafana. These dashboards provide visualizations and insights for various metrics related to software development. Each data source or DevOps tool you add comes with its own dashboards, and several dashboards are prebuilt for common use cases.
  • To customize the dashboards according to your specific goals and requirements, you can tweak them using Grafana’s features. Additionally, if you prefer to create your own dashboards, you can write SQL queries that fetch the necessary data from DevLake, referring to the domain layer data schema and the SQL examples in the metrics documentation (a sample query follows this list).
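
To give a flavor of such a query, the sketch below pulls monthly resolved-issue counts from the domain layer. The credentials match the Docker Compose defaults (user merico, database lake), and the table and column names are taken from the domain layer schema docs; verify both for your deployment.

# Hypothetical query against the domain-layer issues table (assumes an exposed MySQL port)
mysql -h 127.0.0.1 -u merico -pmerico lake -e "
  SELECT DATE_FORMAT(resolution_date, '%Y-%m') AS month,
         COUNT(*) AS issues_resolved
  FROM issues
  WHERE resolution_date IS NOT NULL
  GROUP BY month
  ORDER BY month;"

The same query can be pasted into a Grafana panel backed by the DevLake database, which is how the pre-built dashboards are assembled.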

DevLake supports a variety of use cases for roles such as Engineering Leads, QA Engineers, and OSS Maintainers.

Exploring More Use Cases: Engineering Leads

  1. DORA Metrics

DORA, which stands for “DevOps Research & Assessment,” becomes a focal point for Engineering Leads. This standardized framework focuses on both Velocity and Stability. Within velocity are two core metrics:

  • Deployment Frequency: the number of successful deployments to production. How rapidly is your team releasing to users?
  • Lead Time for Changes: how long does it take a commit to reach code running in production? This is important, as it reflects how quickly your team can respond to user requirements.

Stability is composed of two core metrics:

  • Median Time to Restore Service: How long does it take the team to properly recover from a failure once it is identified?
  • Change Failure Rate: How often are your deployments causing a failure?
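
To make the first of these concrete, a monthly deployment-frequency query against the domain layer might look like the following. This is a sketch rather than DevLake’s exact dashboard query: the cicd_deployment_commits table and its columns come from the domain schema docs, and the credentials again assume the Docker Compose defaults.

# Hypothetical monthly deployment frequency from the domain layer
mysql -h 127.0.0.1 -u merico -pmerico lake -e "
  SELECT DATE_FORMAT(finished_date, '%Y-%m') AS month,
         COUNT(DISTINCT cicd_deployment_id) AS deployments
  FROM cicd_deployment_commits
  WHERE result = 'SUCCESS'
    AND environment = 'PRODUCTION'
  GROUP BY month
  ORDER BY month;"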


  2. Engineering Overview
Use Cases: This dashboard provides an overview of Git and project management metrics.
Data Source Required: Jira + GitHub, or Jira + GitLab.

  3. Engineering Throughput and Cycle Time
Use Cases: This dashboard shows engineering throughput and cycle time, which helps identify productivity trends and bottlenecks in the development process.
Data Source Required: GitHub and Jira (a transformation is required to tell DevLake what the story_points field is).

  4. Engineering Component and File-level Git Metrics Dashboard
Use Cases: This dashboard focuses heavily on the components, files, and authors of a repository. It shows how many authors have committed to specific files, along with the code and commit distribution.
Data Source Required: GitHub, GitLab, Azure DevOps.

Key Achievements:

Unified Data Integration: DevLake seamlessly integrated a variety of our customer’s DevOps data sources, including Jira, GitLab, and Jenkins, into a coherent and easily understandable view. The implementation of a standardized data model greatly simplified the integration process, offering our customer’s teams a consolidated and unified perspective on their entire SDLC, from project progress to issue tracking and continuous integration processes.

Out-of-the-Box Insights: With DevLake, our customer had immediate access to crucial engineering metrics through user-friendly, purpose-driven dashboards. These dashboards, tailored for real-world scenarios, provided actionable insights ranging from code commit patterns to deployment frequencies. Our customer could quickly identify bottlenecks in their deployment process, resulting in a significant reduction in deployment times and improved overall efficiency.

Customizability: Recognizing the distinct needs of each development team, DevLake’s high level of customization proved to be a game-changer for our customer. They were able to tailor the platform to match specific project requirements by seamlessly integrating new data sources, defining custom metrics, and creating personalized dashboards, enhancing their overall productivity and collaboration.

Industry Standards Implementation: DevLake’s commitment to aligning with industry standards, notably incorporating recognized DORA metrics, empowered our customer teams to systematically enhance their DevOps performance. For instance, by adhering to DORA metrics, the customer successfully identified improvement areas in deployment frequency and lead time for changes, resulting in a marked acceleration in their software delivery cycles.

In conclusion, Apache DevLake significantly enhanced how our customer develops software. By integrating data from different DevOps tools, the platform gives teams a clear, overall picture of their development work. The ready-to-use insights help our customer’s teams make data-driven decisions quickly, and the platform easily adapts to each team’s specific needs. The journey with Apache DevLake marks a significant change: starting with connecting data sources and creating dashboards, DevLake is improving software development, encouraging teamwork, and inspiring a strong push for excellence. Choosing DevLake isn’t just a decision; it’s a commitment to a future where data, meaningful metrics, and industry standards come together to make software development a success.

To know more about how CloudifyOps can partner with your teams in your cloud journey, write to us at sales@cloudifyops.com.

 
