Metrics

This package includes a pluggable metrics layer so you can export operational data about job executions (success/failure counts, retries, timing, latency, business KPIs) to any monitoring system (Prometheus, StatsD, Influx, OpenTelemetry, etc.).


Contents

  1. Core Concepts

  2. Built‑in Interface & Default Implementation

  3. What Is Instrumented Out‑Of‑The‑Box

  4. Enabling / Injecting a Collector

  5. Adding Custom Metrics (Examples)

  6. Implementing Your Own Collector (Prometheus Example)

  7. Recommended Naming & Label Conventions

  8. Extension Points & Ideas

  9. Troubleshooting

  10. Minimal End‑to‑End Example

  11. Summary


1. Core Concepts

Instrumentation is intentionally minimal and push‑style: the code paths that care about a metric call a simple collector with:

$metrics->increment('jobs_succeeded', 1, ['queue' => 'default']);
$metrics->observe('job_duration_seconds', 0.352, ['queue' => 'high', 'job' => 'jobs:import']);

Your implementation decides how to aggregate, store, export or flush these values.


2. Built‑in Interface & Default Implementation

Interface: Daycry\Jobs\Metrics\MetricsCollectorInterface

interface MetricsCollectorInterface
{
    public function increment(string $counter, int $value = 1, array $labels = []): void;
    public function observe(string $metric, float $value, array $labels = []): void; // histograms / summaries
    public function getSnapshot(): array; // debugging / tests
}

Reference implementation: InMemoryMetricsCollector (keeps counters and simple histogram aggregations in PHP arrays – good for local dev & tests, not production grade).

Histograms store: count, sum, min, max per unique (name+labels) key.
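
To make the per-key aggregation concrete, here is a minimal sketch of how counters and count/sum/min/max histograms could be kept in plain arrays. SketchCollector and its key format are assumptions made for illustration; this is not the package's InMemoryMetricsCollector, and its snapshot layout may differ from the real one.

```php
<?php
declare(strict_types=1);

// Hypothetical stand-in, NOT the package's InMemoryMetricsCollector.
// In a real project it would declare "implements MetricsCollectorInterface".
final class SketchCollector
{
    /** @var array<string,int> */
    private array $counters = [];

    /** @var array<string,array{count:int,sum:float,min:float,max:float}> */
    private array $histograms = [];

    public function increment(string $counter, int $value = 1, array $labels = []): void
    {
        $key = self::key($counter, $labels);
        $this->counters[$key] = ($this->counters[$key] ?? 0) + $value;
    }

    public function observe(string $metric, float $value, array $labels = []): void
    {
        $key = self::key($metric, $labels);
        // The first observation seeds min and max with the observed value.
        $h = $this->histograms[$key] ?? ['count' => 0, 'sum' => 0.0, 'min' => $value, 'max' => $value];
        $h['count']++;
        $h['sum'] += $value;
        $h['min'] = min($h['min'], $value);
        $h['max'] = max($h['max'], $value);
        $this->histograms[$key] = $h;
    }

    public function getSnapshot(): array
    {
        return ['counters' => $this->counters, 'histograms' => $this->histograms];
    }

    // Sort labels so the same label set always maps to the same series,
    // regardless of the order in which the caller listed them.
    private static function key(string $name, array $labels): string
    {
        ksort($labels);
        return $name . json_encode($labels);
    }
}
```

Sorting the labels before building the key is a design choice of this sketch: it keeps ['queue' => 'high', 'job' => 'x'] and ['job' => 'x', 'queue' => 'high'] in one series.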


3. What Is Instrumented Out‑Of‑The‑Box

Job Lifecycle Metrics (RequeueHelper)

| Counter Name | When Incremented | Labels |
|---|---|---|
| jobs_succeeded | A job finishes successfully | queue |
| jobs_failed | A job attempt fails (final attempt included) | queue |
| jobs_requeued | A failed job is placed back on the queue for retry | queue |
| jobs_failed_permanently | A job exhausts retries and is forwarded to the DLQ (or removed when DLQ disabled) | queue |
| jobs_dlq_failed | DLQ unconfigured or DeadLetterQueue::store() returned false (silent-loss alert) | queue |
| jobs_timed_out | A job hit its timeout (pcntl_alarm or fallback path) | job, queue |

Queue-Level Metrics (InstrumentedQueueDecorator)

Wrap any queue backend for automatic instrumentation:

| Metric Name | Type | Labels | Description |
|---|---|---|---|
| queue_enqueue_total | Counter | backend, queue, status | Total enqueue operations (success/error) |
| queue_fetch_total | Counter | backend, queue | Successful fetch operations |
| queue_fetch_empty_total | Counter | backend, queue | Fetch attempts returning no jobs |
| queue_ack_total | Counter | backend, queue | Job acknowledgments (completion) |
| queue_nack_total | Counter | backend, queue | Job negative acks (requeues) |
| queue_enqueue_duration_seconds | Histogram | backend, queue | Time spent in enqueue operation |
| queue_fetch_duration_seconds | Histogram | backend, queue | Time spent in fetch operation |

Additional custom examples (duration, latency, attempts) are trivial to add – see below.
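
The decorator idea behind the queue-level metrics above can be sketched as follows: wrap the backend call, time it, and emit a counter with a status label plus a duration histogram. TimedEnqueue, the closure-based backend call, and the enqueue() signature are all assumptions for illustration; the real InstrumentedQueueDecorator and the queue interface it wraps may look different.

```php
<?php
declare(strict_types=1);

// Hypothetical sketch of the decorator pattern, not the package's class.
final class TimedEnqueue
{
    /**
     * @param \Closure $enqueue inner backend call, assumed shape: fn(string $queue, array $payload): mixed
     * @param object   $metrics any object exposing increment() and observe()
     */
    public function __construct(
        private \Closure $enqueue,
        private object $metrics,
        private string $backend,
    ) {}

    public function enqueue(string $queue, array $payload): mixed
    {
        $labels = ['backend' => $this->backend, 'queue' => $queue];
        $start  = microtime(true);

        try {
            $id = ($this->enqueue)($queue, $payload);
            $this->metrics->increment('queue_enqueue_total', 1, $labels + ['status' => 'success']);

            return $id;
        } catch (\Throwable $e) {
            $this->metrics->increment('queue_enqueue_total', 1, $labels + ['status' => 'error']);

            throw $e;
        } finally {
            // Runs on both paths, so failed enqueues are timed as well.
            $this->metrics->observe('queue_enqueue_duration_seconds', microtime(true) - $start, $labels);
        }
    }
}
```

Recording the duration in the finally block means slow failures show up in the histogram too, which is usually what you want when diagnosing a struggling backend.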


4. Enabling / Injecting a Collector

RequeueHelper accepts an optional MetricsCollectorInterface in its constructor. The recommended (current) way to enable metrics is through the configuration property Jobs::$metricsCollector (see 4.1). The queue worker command (jobs:queue:run) will instantiate that class automatically (zero‑argument constructor) and provide it to the internals. If you set it to null, all metric calls become no‑ops via the nullsafe operator.

If you still need to wire a collector manually (e.g. inside a custom script), you can do:

use Daycry\Jobs\Metrics\InMemoryMetricsCollector;
use Daycry\Jobs\Queues\RequeueHelper;

$collector = new InMemoryMetricsCollector();
$requeue   = new RequeueHelper($collector); // pass this where finalize() is invoked

4.0 Facade Helper

In most internal code you can now simply call:

use Daycry\Jobs\Metrics\Metrics;

$metrics = Metrics::get(); // null if disabled
$metrics?->increment('jobs_custom_metric', 1, ['queue' => 'default']);

This resolves the configured collector once (singleton) and reuses it.

4.1 Configuration Shortcut

You can set a global collector class in config('Jobs')->metricsCollector:

// app/Config/Jobs.php (extends Daycry\Jobs\Config\Jobs)
public ?string $metricsCollector = \App\Metrics\PrometheusCollector::class; // Must implement MetricsCollectorInterface

The jobs:queue:run command automatically instantiates this class (it must have a zero‑argument constructor) and falls back to the in‑memory collector if the class is misconfigured. Set the property to null to disable metrics entirely.

4.2 Disabling Metrics

Set the property to null:

public ?string $metricsCollector = null; // all increment/observe are skipped

This avoids any runtime overhead except a few null checks.
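
The short sketch below illustrates why a disabled collector costs almost nothing: with PHP 8's nullsafe operator, a call on null is skipped entirely, including evaluation of its arguments. buildLabels() is a hypothetical helper that exists only to count how often the label array gets built.

```php
<?php
declare(strict_types=1);

// Hypothetical helper used only to show when label arguments are evaluated.
function buildLabels(int &$calls): array
{
    $calls++;

    return ['queue' => 'default'];
}

$calls   = 0;
$metrics = null; // what call sites see when metrics are disabled

// The nullsafe operator short-circuits the whole call, so buildLabels() never runs.
$metrics?->increment('jobs_custom_metric', 1, buildLabels($calls));
```

After this runs, $calls is still 0. Work done before the call (for example computing a duration into a local variable) is not skipped, so keep expensive label computation inside the argument list if you want it avoided.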

4.3 Custom Constructor Arguments

If your collector needs dependencies, you have two options:

  1. Keep a zero‑argument constructor and resolve dependencies statically/singleton inside it.

  2. Fork or extend QueueRunCommand overriding getMetricsCollector() to build it via your preferred container.
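
As a rough illustration of option 1, the sketch below keeps the constructor argument-free and pulls its single dependency from a static holder that your bootstrap code fills in. MetricsSink and SinkCollector are hypothetical names, and the line format they emit is made up for the example.

```php
<?php
declare(strict_types=1);

// Hypothetical static holder; your bootstrap code would call set() once.
final class MetricsSink
{
    private static ?\Closure $writer = null;

    public static function set(\Closure $writer): void
    {
        self::$writer = $writer;
    }

    public static function get(): \Closure
    {
        // Default to a writer that discards everything.
        return self::$writer ?? static function (string $line): void {};
    }
}

// Hypothetical collector; in a real project it would also declare
// "implements MetricsCollectorInterface".
final class SinkCollector
{
    private \Closure $writer;

    public function __construct() // zero arguments, so the worker can instantiate it
    {
        $this->writer = MetricsSink::get();
    }

    public function increment(string $counter, int $value = 1, array $labels = []): void
    {
        ($this->writer)(sprintf('counter %s %d %s', $counter, $value, json_encode($labels)));
    }

    public function observe(string $metric, float $value, array $labels = []): void
    {
        ($this->writer)(sprintf('observe %s %F %s', $metric, $value, json_encode($labels)));
    }

    public function getSnapshot(): array
    {
        return []; // nothing is retained locally in this sketch
    }
}
```

The trade-off is hidden global state: it is easy to wire up, but tests need to call MetricsSink::set() themselves before constructing the collector.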

4.4 Worker Integration Points

Current built‑in emission sites:

| Location | Metric(s) |
|---|---|
| RequeueHelper::finalize | jobs_succeeded, jobs_failed, jobs_requeued, jobs_failed_permanently, jobs_dlq_failed |
| JobLifecycleCoordinator::safeExecuteWithTimeout | jobs_timed_out (both pcntl and fallback paths) |
| QueueRunCommand::process | jobs_fetched, jobs_age_seconds, jobs_exec_seconds |

You can safely add more in your own extended command or PRs.


5. Adding Custom Metrics (Examples)

You can instrument additional points such as execution duration, queue latency, attempt counts, or domain numbers (e.g. rows processed).

5.1 Execution Duration

Inside the lifecycle (e.g. after an ExecutionResult is produced):

$metrics?->observe('job_duration_seconds', $result->durationSeconds(), [
    'queue' => $queueName,
    'job'   => $job->getJob(),
    'success' => $result->success ? '1' : '0',
]);

5.2 Queue Latency (Enqueue → Start)

Store an enqueuedAt timestamp in your envelope when pushing, then at start:

if (isset($envelope->enqueuedAt)) {
    $metrics?->observe('job_queue_latency_seconds', microtime(true) - $envelope->enqueuedAt, [
        'queue' => $envelope->queue,
        'job'   => $job->getJob(),
    ]);
}
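
For the producer side, a minimal sketch of stamping the envelope and computing the latency later might look like this. The stdClass envelope, makeEnvelope() and queueLatencySeconds() are assumptions for illustration; the package's real envelope type may carry the timestamp differently.

```php
<?php
declare(strict_types=1);

// Hypothetical envelope: a plain object with the fields the latency check reads.
// $now is injectable so the functions can be tested without real clocks.
function makeEnvelope(string $queue, array $payload, ?float $now = null): stdClass
{
    $envelope             = new stdClass();
    $envelope->queue      = $queue;
    $envelope->payload    = $payload;
    $envelope->enqueuedAt = $now ?? microtime(true); // float Unix timestamp

    return $envelope;
}

// Latency in seconds, or null when the envelope carries no timestamp.
// Clamped at zero because producer and worker clocks can disagree slightly.
function queueLatencySeconds(object $envelope, ?float $now = null): ?float
{
    if (!isset($envelope->enqueuedAt)) {
        return null;
    }

    return max(0.0, ($now ?? microtime(true)) - (float) $envelope->enqueuedAt);
}
```

The returned value can be passed straight to observe('job_queue_latency_seconds', ...) when it is not null.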

5.3 Attempt Counter

$metrics?->increment('job_attempts_total', 1, [
    'queue' => $queueName,
    'job'   => $job->getJob(),
]);

5.4 Domain / Business Metric

If a handler returns JSON‑encodable output that contains a numeric field, you can turn that field into a counter:

// Suppose $result->output = '{"imported":523,"skipped":12}'
$data = json_decode($result->output ?? 'null', true);
if (is_array($data) && isset($data['imported'])) {
    $metrics?->increment('users_imported_total', (int)$data['imported'], [
        'job' => $job->getJob(),
    ]);
}

5.5 Timeout Counter

jobs_timed_out is now emitted automatically by JobLifecycleCoordinator::safeExecuteWithTimeout() whenever the configured timeout fires (both pcntl_alarm path on POSIX and the post-execute time check fallback). The labels are job and queue. You only need to add an extra increment if you implement additional timeout enforcement outside the coordinator:

$metrics?->increment('jobs_timed_out', 1, ['job' => $jobName, 'queue' => $queueName]);

6. Implementing Your Own Collector (Prometheus Example)

Example skeleton using promphp/prometheus_client_php:

use Daycry\Jobs\Metrics\MetricsCollectorInterface;
use Prometheus\CollectorRegistry;

final class PrometheusCollector implements MetricsCollectorInterface
{
    private CollectorRegistry $registry;

    // The registry is optional so the worker can still instantiate this class with no arguments.
    public function __construct(?CollectorRegistry $registry = null)
    {
        $this->registry = $registry ?? CollectorRegistry::getDefault();
    }

    public function increment(string $counter, int $value = 1, array $labels = []): void
    {
        ksort($labels); // keep label order stable across call sites
        $c = $this->registry->getOrRegisterCounter('jobs', $counter, 'Jobs counter', array_keys($labels));
        $c->incBy($value, array_values($labels));
    }

    public function observe(string $metric, float $value, array $labels = []): void
    {
        ksort($labels);
        // Argument order is (namespace, name, help, label names, buckets).
        $h = $this->registry->getOrRegisterHistogram('jobs', $metric, 'Jobs histogram', array_keys($labels), [0.1, 0.5, 1, 2, 5, 10]);
        $h->observe($value, array_values($labels));
    }

    public function getSnapshot(): array
    {
        return []; // Not strictly needed; optional for debugging.
    }
}

Expose a /metrics endpoint in your app and let Prometheus scrape it.

For StatsD / DogStatsD you would map increment() to statsd->increment() and observe() to statsd->histogram() or timing calls.
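
As a sketch of that mapping, the code below formats DogStatsD-style datagrams (name:value|type|#tag:value,...) and hands them to a sender. statsdLine(), StatsdSketchCollector and the injected $send closure are assumptions for illustration, not part of this package or of any StatsD client library. Plain StatsD has no tag syntax or histogram type, so without DogStatsD you would drop the labels and use a timing type instead.

```php
<?php
declare(strict_types=1);

// Formats one DogStatsD datagram: name:value|type, plus |#k:v,k:v when labels exist.
function statsdLine(string $name, int|float $value, string $type, array $labels = []): string
{
    $line = sprintf('%s:%s|%s', $name, $value, $type);

    if ($labels !== []) {
        $tags = [];
        foreach ($labels as $key => $labelValue) {
            $tags[] = $key . ':' . $labelValue;
        }
        $line .= '|#' . implode(',', $tags);
    }

    return $line;
}

// Hypothetical collector; in a real project it would also declare
// "implements MetricsCollectorInterface".
final class StatsdSketchCollector
{
    /** @param \Closure $send assumed shape: fn(string $datagram): void, e.g. a UDP write */
    public function __construct(private \Closure $send) {}

    public function increment(string $counter, int $value = 1, array $labels = []): void
    {
        ($this->send)(statsdLine($counter, $value, 'c', $labels)); // "c" = counter
    }

    public function observe(string $metric, float $value, array $labels = []): void
    {
        ($this->send)(statsdLine($metric, $value, 'h', $labels)); // "h" = DogStatsD histogram
    }

    public function getSnapshot(): array
    {
        return []; // values live in the StatsD agent, not in this process
    }
}
```

Because this sketch takes a constructor argument, it would need the zero‑argument treatment described in section 4.3 before the worker could instantiate it from configuration.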



8. Extension Points & Ideas

| Metric Idea | Source Hook |
|---|---|
| Execution duration | After each ExecutionResult |
| Queue latency | Envelope: enqueuedAt vs start |
| Retry delay distribution | When computing backoff delay |
| Callback chain depth | In callback dispatch |
| Payload size bytes | Before execution (strlen json_encode) |
| Output truncation count | Where maxOutputLength applied |
| Active single-instance lock | When acquiring / releasing |


9. Troubleshooting

| Issue | Cause / Fix |
|---|---|
| Counters always zero | Collector not injected (null). Check that Jobs::$metricsCollector is set, or that a collector is passed to RequeueHelper. |
| High memory usage with InMemory | Long‑running worker + many unique label combos. v1.2 caps cardinality at 5 000 entries with FIFO eviction, but if you still see drift switch to a streaming exporter. |
| Cardinality explosion | Too many distinct job or attempt labels. Trim labels. |
| Missing jobs_timed_out increments | Job did not actually time out, OR the timeout fired but a previous PHP version ignored SIGALRM during CPU-bound work. v1.2 enables pcntl_async_signals(true) so this should not occur on PHP 7.1+. |
| jobs_dlq_failed increments | DLQ unconfigured or push to DLQ failed. Configure Config\Jobs::$deadLetterQueue and verify the destination queue accepts pushes. |
| Histogram buckets seem coarse | Adjust bucket array in your custom collector implementation. |
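
Two of the fixes above, trimming labels and capping the number of stored series with FIFO eviction, can be sketched in a few lines. trimLabels() and putWithFifoCap() are hypothetical helpers written for this example; they are not the package's internal implementation of the 5 000-entry cap.

```php
<?php
declare(strict_types=1);

// Keep only allow-listed label names, in a stable (sorted) order.
function trimLabels(array $labels, array $allowed): array
{
    $kept = array_intersect_key($labels, array_flip($allowed));
    ksort($kept);

    return $kept;
}

// Store a value under $key, evicting the oldest key first when the cap is reached.
// PHP arrays preserve insertion order, so the first key is the oldest one.
function putWithFifoCap(array &$store, string $key, mixed $value, int $cap): void
{
    $isNew = !array_key_exists($key, $store);

    if ($isNew && $store !== [] && count($store) >= $cap) {
        unset($store[array_key_first($store)]);
    }

    $store[$key] = $value;
}
```

A call site could run trimLabels($labels, ['queue', 'job']) before every increment() or observe(), so that a stray high-cardinality label such as an attempt number never reaches the collector.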


10. Minimal End‑to‑End Example

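use Daycry\Jobs\Metrics\InMemoryMetricsCollector;
use Daycry\Jobs\Queues\RequeueHelper;

$metrics = new InMemoryMetricsCollector();
$requeue = new RequeueHelper($metrics); // now core counters fire

// After executing a job somewhere in your worker loop:
$metrics->observe('job_duration_seconds', 0.91, ['queue' => 'default', 'job' => 'jobs:cleanup']);

// Dump the aggregated counters and histograms for inspection:
print_r($metrics->getSnapshot());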

11. Summary

  1. Provide a MetricsCollectorInterface implementation.

  2. Register it via Jobs::$metricsCollector, or inject it into RequeueHelper (or any other lifecycle component you extend).

  3. Use increment() for discrete counts; observe() for timings / sizes.

  4. Keep label sets small & stable.

  5. Export using your monitoring backend of choice.

Feel free to open issues or PRs if you want deeper native instrumentation hooks.