Building Python Packages with Azure DevOps

Today, organisations have to manage increasingly complex data pipelines integrating diverse data sources and supporting analytics at scale. As these environments grow, the need for reusable, well-tested code packages becomes fundamental to success, especially across large enterprise projects.

In this blog post, we’ll see how to create a scalable Python package (one engineered to handle growing numbers of users, increasing data volumes, and new features without degrading performance or reliability) that integrates well with Azure DevOps. We’ll also look at proven approaches for automating CI/CD, enforcing rigorous testing standards, and publishing packages efficiently. Whilst our example focuses on Python, the principles apply broadly to any programming language.

The Enterprise Integration Challenge

Modern enterprises rely heavily on various cloud services and tools for their daily operations, whether it’s cloud storage (like Google Drive or SharePoint), messaging platforms (like Slack or Teams), databases (like PostgreSQL or Azure SQL), or data processing engines (like Spark or Databricks). In our case, working within the Microsoft ecosystem, we encountered a common challenge: multiple teams were repeatedly developing utilities for the same services. Specifically, while implementing alert notifications in Teams, each group developed its own approach, resulting in inconsistent reliability and security standards, slower development, and increased maintenance overheads. Clearly, we needed to standardise these implementations into a single, reliable package.

Why Azure DevOps Was Our Natural Choice

As our entire development lifecycle already lived in Azure DevOps, extending it to handle Python package management was a logical next step. We identified four primary advantages that solidified this decision:

  • Unified Ecosystem: Azure DevOps consolidates source control (Azure Repos), automated workflows (Azure Pipelines), and package distribution (Azure Artifacts) into a single, cohesive platform. This integration streamlined our development process and reduced the complexity of managing multiple tools.
  • Economic Viability: Utilising Azure Artifacts for our private PyPI (Python Package Index) feed was a cost-effective strategy, eliminating the need for separate hosting infrastructure and integrating directly with our existing Azure billing, thus simplifying financial management.
  • Enhanced Security Posture: Azure Artifacts provides essential security features out-of-the-box, like package signing and vulnerability scanning. These capabilities were crucial for maintaining the integrity of our software supply chain and adhering to our security standards.
  • Inherent Scalability: As a cloud-native service, Azure DevOps offers flexibility to grow with our needs. We can manage an extensive portfolio of packages and their versions without being constrained by infrastructure limitations or facing unexpected cost increases.

High-Level Overview of Azure DevOps

Module Structure and Design Principles

Robust modular architecture is fundamental to the effectiveness and maintainability of any enterprise Python package. The package described here is organised into modules, with each module addressing a specific set of common challenges typical in data engineering environments across Microsoft platforms, as well as other major ecosystems:

Example Python Package Structure
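The image above illustrates the package layout; a structure along these lines reflects the modular design described (the `sharepoint` and `teams` modules come from later sections, while the remaining module names are purely illustrative):

```
platform_common/
├── __init__.py        # lazy-loads heavy submodules (see below)
├── sharepoint/        # Microsoft Graph / SharePoint integration
├── teams/             # Teams alert notifications
├── storage/           # cloud storage helpers
└── utils/             # shared helpers (logging, config, ...)
```

Each module bundles its own dependencies, which is what later makes lazy loading and modular installation possible.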

Real-World Challenge: Databricks Deployment Bottlenecks

Our experience deploying this package in Databricks environments highlighted the practical impact of architectural choices: when we initially installed the entire package on Databricks clusters, setup times of 6 to 10 minutes per job became a recurring obstacle, slowing down workflows and driving up compute costs. This real-world challenge prompted us to further refine the package’s structure, leading to the adoption of lazy loading and modular installation approaches.

Solution 1: Lazy Loading for Faster Imports

One of our most resource-intensive modules was the SharePoint integration component. This module requires several heavy dependencies like azure-identity and msgraph-sdk to interact with Microsoft’s Graph API. Initially, these dependencies were loaded whenever our package was imported, regardless of whether the SharePoint functionality was needed.

By implementing lazy loading for the SharePoint module, we ensured these dependencies were only imported when explicitly required. Here’s how we restructured our imports:

# In platform_common/__init__.py
import lazy_loader as lazy

# Defer the loading of the 'sharepoint' submodule until it is first accessed
__getattr__, __dir__, __all__ = lazy.attach(
    __name__,
    submodules=['sharepoint'],
)

Impact:

  • 60% reduction in import times.
  • 40% reduction in memory usage.

This optimisation was particularly valuable in Databricks environments where every second of cluster startup time directly impacts cost.

Solution 2: Modular Installation with extras_require

We restructured the package to use optional dependencies, or extras, allowing teams to install only the components needed for their specific tasks. This is especially useful in Databricks, where lightweight installations are preferred:

# In setup.py; `requirements` holds the full dependency list defined earlier
extras_require={
    "sharepoint": [
        "azure-identity",
        "msgraph-sdk"
    ],
    "full": requirements
}

This approach allows teams to install only what they need, significantly reducing installation time:

# Core functionality only
pip install platform-common

# Core plus the SharePoint integration (quote the brackets in zsh)
pip install "platform-common[sharepoint]"

# Everything: core plus all optional dependencies
pip install "platform-common[full]"

Key Benefits

  • Reduced installation and startup times: only the necessary modules are loaded, saving time and resources.
  • Lower memory usage: leaner environments mean less overhead and better performance.
  • Increased productivity: faster cluster startups and tailored installations streamline development and operations.

Building Intelligence into our CI/CD Pipeline

Delivering reliable software today means going beyond simple automation. As we advanced our Python package development for Microsoft-centric environments, we embedded smarter logic and stricter quality checks into our Azure DevOps pipeline to make every step more focused.

Smart Version Management

One of the most effective improvements we introduced was intelligent version control. Rather than bumping the version number with every commit, our pipeline now checks for actual changes to the core library before deciding whether to increment the version:

# Determine the last release tag, then check for changes in the actual library code
$lastTag = git describe --tags --abbrev=0
$changes = git diff --name-only $lastTag HEAD -- platform_common/

if ([string]::IsNullOrWhiteSpace($changes)) {
    Write-Host "No changes detected in platform_common folder. Skipping version increment."
    Write-Host "##vso[task.setvariable variable=shouldPublish;isOutput=true]false"
    exit 0
}

For example, if no updates are detected in the main source folder, the process skips the version update and avoids unnecessary builds. This approach has delivered measurable benefits:

  • 30% reduction in unnecessary builds.
  • Cleaner version history with meaningful releases.
  • Reduced noise in our artifact feed.

What’s more, our workflow is structured so that pull requests only trigger automated tests, ensuring code quality before integration. The continuous deployment process, including artifact versioning and publishing, is executed only when changes are merged into the main branch. This separation helps to maintain a clean release process and ensures that only validated, production-ready code is deployed.
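In Azure Pipelines, this separation is expressed with `trigger` and `pr` sections plus a condition on the output variable set by the change-detection script. The following is an illustrative sketch only; job, step, and script names are our own placeholders, not the actual pipeline:

```yaml
# Sketch: tests run on every PR, publishing happens only from main
trigger:
  branches:
    include:
      - main

pr:
  branches:
    include:
      - '*'

jobs:
  - job: CheckChanges
    steps:
      - powershell: ./detect-changes.ps1   # runs the change-detection script shown above
        name: check

  - job: Publish
    dependsOn: CheckChanges
    # Publish only from main, and only when shouldPublish was not set to false above
    condition: >
      and(eq(variables['Build.SourceBranch'], 'refs/heads/main'),
      ne(dependencies.CheckChanges.outputs['check.shouldPublish'], 'false'))
```

Note that the change-detection step needs an explicit `name` so that downstream jobs can reference its output variable.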

Zero-Friction Package Publishing

For package distribution, we rely on Azure Artifacts as our internal repository. This decision removed the need for a separate private PyPI server and streamlined the publishing workflow. Packages are released directly from our CI/CD pipeline, and Azure’s security features handle access control. Built-in DevOps tasks, such as authentication and upload steps, make the process straightforward and reliable:

- task: PipAuthenticate@1
  displayName: 'Pip Authenticate'
  inputs:
    artifactFeeds: '$(ARTIFACT_FEED)'

- task: TwineAuthenticate@1
  inputs:
    artifactFeed: 'Project/$(ARTIFACT_FEED)'

- script: |
    python -m twine upload -r "$(ARTIFACT_FEED)" --config-file $(PYPIRC_PATH) $(System.ArtifactsDirectory)/dist/*
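The upload step assumes the distributions already exist in `$(System.ArtifactsDirectory)/dist`; a build step along these lines, using the PyPA `build` tool and placed before the upload, would typically produce them (display name is illustrative):

```yaml
# Build the sdist and wheel that the twine upload step publishes
- script: |
    python -m pip install --upgrade build
    python -m build --outdir $(System.ArtifactsDirectory)/dist
  displayName: 'Build package distributions'
```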

Additional Optimisations for Quality and Reliability

Beyond version management, our pipeline includes several enhancements that are essential for Python projects in Azure DevOps:

  • Automated Testing: every code change triggers a full suite of unit and integration tests, catching issues early on and ensuring stability.
  • Code Coverage Analysis: we monitor test coverage to make sure critical parts of the code are exercised, maintaining high reliability.
  • Static Code Analysis: tools enforce coding standards and identify potential problems before deployment.

Together, these enhancements have made our CI/CD process not only more efficient but also more robust, supporting the high standards required for enterprise software delivery.
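These checks map onto built-in Azure DevOps tasks. The snippet below is an illustrative sketch of how such steps are commonly wired up with pytest and coverage reporting; paths and exact task versions may need adjusting for your organisation:

```yaml
# Illustrative: run tests with coverage and publish the results
- script: |
    python -m pip install pytest pytest-cov
    python -m pytest --junitxml=junit/test-results.xml --cov=platform_common --cov-report=xml
  displayName: 'Run unit and integration tests'

- task: PublishTestResults@2
  inputs:
    testResultsFiles: 'junit/test-results.xml'

- task: PublishCodeCoverageResults@2
  inputs:
    summaryFileLocation: '$(System.DefaultWorkingDirectory)/coverage.xml'
```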

Conclusion

Azure DevOps transformed our Python package development from a manual, error-prone process into an automated, intelligent system that scales with our enterprise needs.

For teams working within Microsoft’s cloud ecosystem, Azure DevOps is more than a tool: it’s a strategic advantage that enables better software delivery whilst maintaining enterprise standards for security, compliance, and performance.

The investment in building this system has paid dividends in developer productivity, operational reliability, and cost optimisation. Most importantly, it’s enabled our teams to focus on solving business problems rather than reinventing infrastructure solutions.

As certified experts in Azure and the Microsoft cloud stack, the ClearPeaks team is ready to help you achieve the same level of success. Whether you’re looking to implement new Azure solutions or optimise existing ones, from DevOps automation and enterprise data engineering to advanced analytics and Databricks integration, we can guide you every step of the way. Reach out to us today!

Saqib T
saqib.tamli@clearpeaks.com