skills/cloud-azure/azure-storage-file-datalake-py/SKILL.md
Azure Data Lake Storage Gen2 SDK for Python. Use for hierarchical file systems, big data analytics, and file/directory operations.
npx skillsauth add bereniketech/claude_kit azure-storage-file-datalake-pyInstall this skill globally with one command. Works with Claude Code, Cursor, and Windsurf.
3 of 9 scanners reported clean
Some scanners were skipped, did not run, or reported a non-clean status. Review each row below.
Hierarchical file system for big data analytics workloads.
pip install azure-storage-file-datalake azure-identity
AZURE_STORAGE_ACCOUNT_URL=https://<account>.dfs.core.windows.net
from azure.identity import DefaultAzureCredential
from azure.storage.filedatalake import DataLakeServiceClient
credential = DefaultAzureCredential()
account_url = "https://<account>.dfs.core.windows.net"
service_client = DataLakeServiceClient(account_url=account_url, credential=credential)
| Client | Purpose |
|--------|---------|
| DataLakeServiceClient | Account-level operations |
| FileSystemClient | Container (file system) operations |
| DataLakeDirectoryClient | Directory operations |
| DataLakeFileClient | File operations |
# Create file system (container)
file_system_client = service_client.create_file_system("myfilesystem")
# Get existing
file_system_client = service_client.get_file_system_client("myfilesystem")
# Delete
service_client.delete_file_system("myfilesystem")
# List file systems
for fs in service_client.list_file_systems():
print(fs.name)
file_system_client = service_client.get_file_system_client("myfilesystem")
# Create directory
directory_client = file_system_client.create_directory("mydir")
# Create nested directories
directory_client = file_system_client.create_directory("path/to/nested/dir")
# Get directory client
directory_client = file_system_client.get_directory_client("mydir")
# Delete directory
directory_client.delete_directory()
# Rename/move directory
directory_client.rename_directory(new_name="myfilesystem/newname")
# Get file client
file_client = file_system_client.get_file_client("path/to/file.txt")
# Upload from local file
with open("local-file.txt", "rb") as data:
file_client.upload_data(data, overwrite=True)
# Upload bytes
file_client.upload_data(b"Hello, Data Lake!", overwrite=True)
# Append data (for large files)
file_client.append_data(data=b"chunk1", offset=0, length=6)
file_client.append_data(data=b"chunk2", offset=6, length=6)
file_client.flush_data(12) # Commit the data
file_client = file_system_client.get_file_client("path/to/file.txt")
# Download all content
download = file_client.download_file()
content = download.readall()
# Download to file
with open("downloaded.txt", "wb") as f:
download = file_client.download_file()
download.readinto(f)
# Download range
download = file_client.download_file(offset=0, length=100)
file_client.delete_file()
# List paths (files and directories)
for path in file_system_client.get_paths():
print(f"{'DIR' if path.is_directory else 'FILE'}: {path.name}")
# List paths in directory
for path in file_system_client.get_paths(path="mydir"):
print(path.name)
# Recursive listing
for path in file_system_client.get_paths(path="mydir", recursive=True):
print(path.name)
# Get properties
properties = file_client.get_file_properties()
print(f"Size: {properties.size}")
print(f"Last modified: {properties.last_modified}")
# Set metadata
file_client.set_metadata(metadata={"processed": "true"})
# Get ACL
acl = directory_client.get_access_control()
print(f"Owner: {acl['owner']}")
print(f"Permissions: {acl['permissions']}")
# Set ACL
directory_client.set_access_control(
owner="user-id",
permissions="rwxr-x---"
)
# Update ACL entries
from azure.storage.filedatalake import AccessControlChangeResult
directory_client.update_access_control_recursive(
acl="user:user-id:rwx"
)
from azure.storage.filedatalake.aio import DataLakeServiceClient
from azure.identity.aio import DefaultAzureCredential
async def datalake_operations():
credential = DefaultAzureCredential()
async with DataLakeServiceClient(
account_url="https://<account>.dfs.core.windows.net",
credential=credential
) as service_client:
file_system_client = service_client.get_file_system_client("myfilesystem")
file_client = file_system_client.get_file_client("test.txt")
await file_client.upload_data(b"async content", overwrite=True)
download = await file_client.download_file()
content = await download.readall()
import asyncio
asyncio.run(datalake_operations())
append_data + flush_data for large file uploadsget_paths with recursive=True for full directory listingThis skill is applicable to execute the workflow or actions described in the overview.
testing
AUTHORIZED USE ONLY: This skill contains dual-use security techniques. Before proceeding with any bypass or analysis: > 1.
development
Detects missing zeroization of sensitive data in source code and identifies zeroization removed by compiler optimizations, with assembly-level analysis, and control-flow verification. Use for auditing C/C++/Rust code handling secrets, keys, passwords, or other sensitive data.
development
Comprehensive guide to auditing web content against WCAG 2.2 guidelines with actionable remediation strategies.
development
Expert in threat modeling methodologies, security architecture review, and risk assessment. Masters STRIDE, PASTA, attack trees, and security requirement extraction. Use PROACTIVELY for security architecture reviews, threat identification, or building secure-by-design systems.