skills/laravel-data-chunking-large-datasets/SKILL.md
Process large datasets efficiently using chunk(), chunkById(), lazy(), and cursor() to reduce memory consumption and improve performance
Process large datasets efficiently by breaking them into manageable chunks to reduce memory consumption and improve performance.
// BAD: Loading all records into memory
$users = User::all(); // Could be millions of records!
foreach ($users as $user) {
$user->sendNewsletter();
}
// BAD: pluck() fetches a single column, but still loads every row at once
$emails = User::pluck('email'); // Collection of millions of emails
foreach ($emails as $email) {
Mail::to($email)->send(new Newsletter());
}
chunk()
// Process 100 records at a time
User::chunk(100, function ($users) {
foreach ($users as $user) {
$user->calculateStatistics();
$user->save();
}
});
// With conditions
User::where('active', true)
->chunk(200, function ($users) {
foreach ($users as $user) {
ProcessUserJob::dispatch($user);
}
});
chunkById()
// Prevents issues when modifying records during iteration
User::where('newsletter_sent', false)
->chunkById(100, function ($users) {
foreach ($users as $user) {
$user->update(['newsletter_sent' => true]);
Mail::to($user)->send(new Newsletter());
}
});
// With custom column
Payment::where('processed', false)
->chunkById(100, function ($payments) {
foreach ($payments as $payment) {
$payment->process();
}
}, 'payment_id'); // Custom ID column
lazy()
// Uses PHP generators, minimal memory footprint
User::where('created_at', '>=', now()->subDays(30))
->lazy()
->each(function ($user) {
$user->recalculateScore();
});
// With chunking size control
User::lazy(100)->each(function ($user) {
ProcessRecentUser::dispatch($user);
});
// Filter and map with lazy collections
$results = User::lazy()
->filter(fn($user) => $user->hasActiveSubscription())
->map(fn($user) => [
'id' => $user->id,
'revenue' => $user->calculateRevenue(),
])
->take(1000);
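The reason lazy() keeps memory low can be sketched in plain PHP, without a database. The generator below fetches one page at a time and yields items individually, so a consumer that stops early never triggers the remaining fetches. `lazyItems` and `$fetchPage` are hypothetical names for illustration only; this is a sketch of the idea, not Laravel's implementation.

```php
<?php
declare(strict_types=1);

/**
 * Yield items one at a time while fetching them page by page.
 * $fetchPage is a hypothetical callback: (int $offset, int $limit) => array.
 */
function lazyItems(callable $fetchPage, int $pageSize): Generator
{
    $offset = 0;
    do {
        $page = $fetchPage($offset, $pageSize);
        foreach ($page as $item) {
            yield $item;
        }
        $offset += $pageSize;
    } while (count($page) === $pageSize); // a short page means no more data
}

// Stand-in data source that counts how many pages were requested
$data = range(1, 10);
$fetches = 0;
$fetchPage = function (int $offset, int $limit) use ($data, &$fetches): array {
    $fetches++;
    return array_slice($data, $offset, $limit);
};

// Collect the first two even numbers, then stop consuming
$firstEven = [];
foreach (lazyItems($fetchPage, 4) as $n) {
    if ($n % 2 === 0) {
        $firstEven[] = $n;
    }
    if (count($firstEven) === 2) {
        break;
    }
}

printf("collected %s after %d fetch(es)\n", implode(',', $firstEven), $fetches);
// prints "collected 2,4 after 1 fetch(es)"
```

Only the first page of four items is ever requested here, which mirrors how `take()` on a lazy collection avoids reading the rest of the table.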
cursor()
// Most memory-efficient for simple forward iteration
foreach (User::where('active', true)->cursor() as $user) {
$user->updateLastSeen();
}
// cursor() returns a LazyCollection, so collection methods can be chained
User::where('verified', true)
->cursor()
->filter(fn($user) => $user->hasCompletedProfile())
->each(fn($user) => SendWelcomeEmail::dispatch($user));
class ExportUsersCommand extends Command
{
public function handle()
{
$file = fopen('users.csv', 'w');
// Write headers
fputcsv($file, ['ID', 'Name', 'Email', 'Created At']);
// Process in chunks to avoid memory issues
User::select('id', 'name', 'email', 'created_at')
->chunkById(500, function ($users) use ($file) {
foreach ($users as $user) {
fputcsv($file, [
$user->id,
$user->name,
$user->email,
$user->created_at->toDateTimeString(),
]);
}
// Optional: Show progress
$this->info("Processed up to ID: {$users->last()->id}");
});
fclose($file);
$this->info('Export completed!');
}
}
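The export above writes each chunk straight to the file handle instead of building the whole CSV in memory. A minimal framework-free sketch of the same pattern follows, assuming rows arrive from any iterable. `userRows` and `writeCsv` are hypothetical helpers, and `php://memory` stands in for a real file so the result can be inspected.

```php
<?php
declare(strict_types=1);

/** Hypothetical row source standing in for a chunked query. */
function userRows(): Generator
{
    yield ['id' => 1, 'name' => 'Ada', 'email' => 'ada@example.com'];
    yield ['id' => 2, 'name' => 'Lin, Grace', 'email' => 'grace@example.com'];
}

/**
 * Write a header and then each row as it arrives; returns the row count.
 * Only one row is held in memory at a time.
 */
function writeCsv($stream, array $header, iterable $rows): int
{
    // Delimiter, enclosure and (empty) escape are passed explicitly
    fputcsv($stream, $header, ',', '"', '');
    $count = 0;
    foreach ($rows as $row) {
        fputcsv($stream, array_values($row), ',', '"', '');
        $count++;
    }
    return $count;
}

$stream = fopen('php://memory', 'w+');
$written = writeCsv($stream, ['ID', 'Name', 'Email'], userRows());

// Read the result back for inspection
rewind($stream);
$csv = stream_get_contents($stream);
fclose($stream);

echo $csv;
```

Because `writeCsv` accepts any iterable, the same function could be fed from a generator, a lazy collection, or a plain array in tests.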
class SendCampaignJob implements ShouldQueue
{
    // The campaign ID is supplied when the job is dispatched
    public function __construct(public int $campaignId) {}

    public function handle()
    {
        // Fail the job if the campaign no longer exists
        $campaign = Campaign::findOrFail($this->campaignId);
        // Process subscribers in chunks
        $campaign->subscribers()
            ->where('unsubscribed', false)
            ->chunkById(50, function ($subscribers) use ($campaign) {
                foreach ($subscribers as $subscriber) {
                    // Spread sends over a few seconds with a random delay
                    SendCampaignEmail::dispatch($campaign, $subscriber)
                        ->onQueue('emails')
                        ->delay(now()->addSeconds(rand(1, 10)));
                }
                // Prevent rate limiting
                sleep(2);
            });
    }
}
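The job above spaces out sends with a random delay plus a pause between chunks. One alternative, sketched here as an assumption rather than anything the original prescribes, is to compute a deterministic delay from each message's position so that at most a fixed number of messages start per second. `staggerDelay` is a hypothetical helper.

```php
<?php
declare(strict_types=1);

/**
 * Seconds to delay the message at 0-based position $index so that
 * no more than $perSecond messages are scheduled for the same second.
 */
function staggerDelay(int $index, int $perSecond): int
{
    if ($index < 0 || $perSecond < 1) {
        throw new InvalidArgumentException('index must be >= 0 and perSecond >= 1');
    }
    return intdiv($index, $perSecond);
}

// Six messages at two per second
$delays = array_map(fn (int $i): int => staggerDelay($i, 2), range(0, 5));

echo implode(',', $delays), "\n"; // prints "0,0,1,1,2,2"
```

In a job like the one above, the result could be passed to `now()->addSeconds(...)` using a running counter across chunks, which keeps the send rate predictable without blocking the worker.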
class MigrateUserData extends Command
{
public function handle()
{
$bar = $this->output->createProgressBar(User::count());
User::with(['profile', 'settings'])
->chunkById(100, function ($users) use ($bar) {
DB::transaction(function () use ($users, $bar) {
foreach ($users as $user) {
// Complex transformation
$newData = $this->transformUserData($user);
NewUserModel::create($newData);
$bar->advance();
}
});
});
$bar->finish();
$this->newLine();
$this->info('Migration completed!');
}
}
class CleanupOldLogs extends Command
{
public function handle()
{
$deletedCount = 0;
ActivityLog::where('created_at', '<', now()->subMonths(6))
->chunkById(1000, function ($logs) use (&$deletedCount) {
$ids = $logs->pluck('id')->toArray();
// Batch delete for efficiency
ActivityLog::whereIn('id', $ids)->delete();
$deletedCount += count($ids);
$this->info("Deleted {$deletedCount} records so far...");
// Give database a breather
usleep(100000); // 100ms
});
$this->info("Total deleted: {$deletedCount}");
}
}
| Method | Use Case | Memory Usage | Notes |
| ------------- | ------------------------ | ---------------- | ---------------------------------------------- |
| chunk() | General processing | Moderate | May skip/duplicate if modifying filter columns |
| chunkById() | Updates during iteration | Moderate | Safer for modifications |
| lazy() | Large result processing | Low | Returns LazyCollection |
| cursor() | Simple forward iteration | Lowest | Returns LazyCollection; cannot eager load |
| get()->each() | Simple operations | High (loads all) | Avoid for large datasets |
User::select('id', 'email', 'name')
->chunkById(100, function ($users) {
// Process with minimal data
});
// Ensure indexed columns in where clauses
User::where('status', 'active') // status should be indexed
->where('created_at', '>', $date) // created_at should be indexed
->chunkById(200, function ($users) {
// Process efficiently
});
User::withoutEvents(function () {
User::chunkById(500, function ($users) {
foreach ($users as $user) {
$user->update(['processed' => true]);
}
});
});
// Instead of updating each record
User::chunkById(100, function ($users) {
$ids = $users->pluck('id')->toArray();
// Bulk update with raw query
DB::table('users')
->whereIn('id', $ids)
->update([
'last_processed_at' => now(),
'processing_count' => DB::raw('processing_count + 1'),
]);
});
class ProcessLargeDataset extends Command
{
public function handle()
{
User::chunkById(100, function ($users) {
ProcessUserBatch::dispatch($users->pluck('id'))
->onQueue('heavy-processing');
});
}
}
class ProcessUserBatch implements ShouldQueue
{
public function __construct(
public Collection $userIds
) {}
public function handle()
{
User::whereIn('id', $this->userIds)
->get()
->each(fn($user) => $user->process());
}
}
test('processes all active users in chunks', function () {
// Create test data
User::factory()->count(150)->create(['active' => true]);
User::factory()->count(50)->create(['active' => false]);
$processed = [];
User::where('active', true)
->chunkById(50, function ($users) use (&$processed) {
foreach ($users as $user) {
$processed[] = $user->id;
}
});
expect($processed)->toHaveCount(150);
expect(count(array_unique($processed)))->toBe(150);
});
test('handles empty datasets gracefully', function () {
$callCount = 0;
User::where('id', '<', 0) // No results
->chunk(100, function ($users) use (&$callCount) {
$callCount++;
});
expect($callCount)->toBe(0);
});
Modifying filter columns during chunk()
// WRONG: May skip records
User::where('processed', false)
->chunk(100, function ($users) {
foreach ($users as $user) {
$user->update(['processed' => true]); // Changes the WHERE condition!
}
});
// CORRECT: Use chunkById()
User::where('processed', false)
->chunkById(100, function ($users) {
foreach ($users as $user) {
$user->update(['processed' => true]);
}
});
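To see why the first version can skip records, the plain-PHP simulation below pages through an in-memory table both ways. It assumes chunk() pages by offset and chunkById() pages by "id greater than the last one seen", as Laravel documents them. `makeRows`, `processWithOffset`, and `processWithKeyset` are hypothetical names, and this is a sketch of the effect rather than the framework's code.

```php
<?php
declare(strict_types=1);

/** Build rows with ids 1..$n, all unprocessed, keyed by id. */
function makeRows(int $n): array
{
    $rows = [];
    for ($id = 1; $id <= $n; $id++) {
        $rows[$id] = ['id' => $id, 'processed' => false];
    }
    return $rows;
}

/** Offset paging: re-run the filter, then skip ($page - 1) * $size rows. */
function processWithOffset(array &$rows, int $size): int
{
    $handled = 0;
    for ($page = 1; ; $page++) {
        $pending = array_values(array_filter($rows, fn (array $r): bool => !$r['processed']));
        $batch = array_slice($pending, ($page - 1) * $size, $size);
        if ($batch === []) {
            break;
        }
        foreach ($batch as $row) {
            $rows[$row['id']]['processed'] = true; // shrinks the filtered set
            $handled++;
        }
    }
    return $handled;
}

/** Keyset paging: re-run the filter, restricted to ids above the last one seen. */
function processWithKeyset(array &$rows, int $size): int
{
    $handled = 0;
    $lastId = 0;
    while (true) {
        $pending = array_values(array_filter(
            $rows,
            fn (array $r): bool => !$r['processed'] && $r['id'] > $lastId
        ));
        $batch = array_slice($pending, 0, $size);
        if ($batch === []) {
            break;
        }
        foreach ($batch as $row) {
            $rows[$row['id']]['processed'] = true;
            $handled++;
            $lastId = $row['id'];
        }
    }
    return $handled;
}

$offsetRows = makeRows(10);
$keysetRows = makeRows(10);
$offsetHandled = processWithOffset($offsetRows, 2);
$keysetHandled = processWithKeyset($keysetRows, 2);

printf("offset: %d of 10, keyset: %d of 10\n", $offsetHandled, $keysetHandled);
// prints "offset: 6 of 10, keyset: 10 of 10"
```

With offset paging, every processed batch drops out of the filtered set, so the next offset jumps past rows that were never seen (here ids 3, 4, 7, and 8). Keyset paging does not depend on how many rows still match, only on the last id handled.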
Not handling chunk callback returns
// Return false to stop chunking
User::chunk(100, function ($users) {
foreach ($users as $user) {
if ($user->hasIssue()) {
return false; // Stop processing
}
$user->process();
}
});
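The early-exit behaviour can be illustrated without a database. The sketch below is a simplified stand-in for chunking (`chunkUntilFalse` is a hypothetical function, not a Laravel API) that stops as soon as the callback returns exactly `false` and treats any other return value, including none, as "continue".

```php
<?php
declare(strict_types=1);

/**
 * Feed $items to $callback in slices of $size.
 * Returns false if the callback asked to stop, true if every chunk ran.
 */
function chunkUntilFalse(array $items, int $size, callable $callback): bool
{
    foreach (array_chunk($items, $size) as $chunk) {
        if ($callback($chunk) === false) {
            return false;
        }
    }
    return true;
}

$seen = [];
$completed = chunkUntilFalse(range(1, 9), 3, function (array $chunk) use (&$seen) {
    foreach ($chunk as $n) {
        if ($n === 5) {
            return false; // stop: the rest of this chunk and all later chunks are skipped
        }
        $seen[] = $n;
    }
    // no return value here means "keep going"
});

printf("seen: %s, completed: %s\n", implode(',', $seen), var_export($completed, true));
// prints "seen: 1,2,3,4, completed: false"
```

Note that item 6 is never processed even though it shares a chunk with item 4. Returning `false` abandons the remainder of the current chunk, so any per-record cleanup has to happen before the return.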
Ignoring database connection limits
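// Long-running work can outlive connection timeouts. PDO::ATTR_TIMEOUT is
// driver-specific, so check what it controls for your database before relying on it
DB::connection()->getPdo()->setAttribute(\PDO::ATTR_TIMEOUT, 3600);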
User::chunkById(100, function ($users) {
// Long running process
});
Remember: When dealing with large datasets, always think about memory usage, query efficiency, and processing time. Chunk your data appropriately!