Project author: ondrejbartas

Project description:
Scheduler / Cron for Sidekiq jobs
Language: Ruby
Project URL: git://github.com/ondrejbartas/sidekiq-cron.git
Created: 2013-08-25T09:43:05Z
Project community: https://github.com/ondrejbartas/sidekiq-cron

License: MIT License

Sidekiq-Cron

A scheduling add-on for Sidekiq

🎬 Introduction video about Sidekiq-Cron by Drifting Ruby

Sidekiq-Cron runs a thread alongside Sidekiq workers to schedule jobs at specified times (using cron notation * * * * * or natural language, powered by Fugit).

It checks for new jobs to schedule every 30 seconds and doesn’t schedule the same job multiple times when more than one Sidekiq process is running.

Scheduled jobs are added only when at least one Sidekiq process is running, but it is safe to use Sidekiq-Cron in environments where multiple Sidekiq processes or nodes are running.

If you want to know how scheduling works, see the Under the hood section.

Changelog

Before upgrading to a new version, please read our Changelog.

Installation

Install the gem:

    $ gem install sidekiq-cron

Or add to your Gemfile and run bundle install:

    gem "sidekiq-cron"

NOTE If you are not using Rails, you need to add require 'sidekiq-cron' somewhere after require 'sidekiq'.

Getting Started

Job properties

    {
      # MANDATORY
      'name' => 'name_of_job', # must be unique!
      'cron' => '1 * * * *', # execute at minute 1 of every hour, ex: 12:01, 13:01, 14:01, ...
      'class' => 'MyClass',
      # OPTIONAL
      'namespace' => 'YourNamespace', # groups jobs together in a namespace (default: 'default')
      'source' => 'dynamic', # source of the job, `schedule`/`dynamic` (default: `dynamic`)
      'queue' => 'name of queue',
      'retry' => '5', # Sidekiq (not supported by ActiveJob) number of retries, or false to discard on first failure
      'args' => '[Array or Hash] of arguments which will be passed to perform method',
      'date_as_argument' => true, # add the time of execution as last argument of the perform method
      'active_job' => true, # enqueue job through Active Job interface
      'queue_name_prefix' => 'prefix', # Active Job queue with prefix
      'queue_name_delimiter' => '.', # Active Job queue with custom delimiter (default: '_')
      'description' => 'A sentence describing what work this job performs',
      'status' => 'disabled' # default: enabled
    }

NOTE The status of a job does not get changed in Redis when a job gets reloaded unless the status property is explicitly set.
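
The rule in the note above can be pictured with a small plain-Ruby sketch. It does not call the gem or Redis; `status_after_reload` is a hypothetical helper written only to illustrate the described behaviour, and the gem's real implementation may differ.

```ruby
# Hypothetical helper (not part of Sidekiq-Cron): decide which status a
# reloaded job ends up with.
# stored_status is what Redis already holds; new_props is the reloaded definition.
def status_after_reload(stored_status, new_props)
  new_props.key?('status') ? new_props['status'] : stored_status
end

stored = 'disabled'

# The reloaded definition does not mention status, so the stored value is kept.
puts status_after_reload(stored, { 'name' => 'name_of_job', 'cron' => '1 * * * *', 'class' => 'MyClass' })

# The reloaded definition sets status explicitly, so it wins.
puts status_after_reload(stored, { 'name' => 'name_of_job', 'status' => 'enabled' })
```

In other words, a job you disabled by hand stays disabled across reloads unless the schedule itself says otherwise.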

Configuration

All configuration options:

    Sidekiq::Cron.configure do |config|
      config.cron_poll_interval = 10 # Default is 30
      config.cron_schedule_file = 'config/my_schedule.yml' # Default is 'config/schedule.yml'
      config.cron_history_size = 20 # Default is 10
      config.default_namespace = 'statistics' # Default is 'default'
      config.available_namespaces = %w[statistics maintenance billing] # Default is `[config.default_namespace]`
      config.natural_cron_parsing_mode = :strict # Default is :single
      config.reschedule_grace_period = 300 # Default is 60
    end

If you are using Rails, you should add the above block inside an initializer (config/initializers/sidekiq-cron.rb).

Time, cron and Sidekiq-Cron

For testing your cron notation you can use crontab.guru.

Sidekiq-Cron uses Fugit to parse the cronline. So please, check Fugit documentation for further information about allowed formats.

If using Rails, this is evaluated against the timezone configured in Rails, otherwise the default is UTC.

If you want to have your jobs enqueued based on a different time zone you can specify a timezone in the cronline,
like this '0 22 * * 1-5 America/Chicago'.

Natural-language formats

Since Sidekiq-Cron v1.7.0, you can use the natural-language formats supported by Fugit, such as:

    "every day at five" # => '0 5 * * *'
    "every 3 hours" # => '0 */3 * * *'

See the relevant part of Fugit documentation for details.

There are multiple modes that determine how natural-language cron strings will be parsed.

  1. :single (default)

         Sidekiq::Cron.configure do |config|
           # Note: This doesn't need to be specified since it's the default.
           config.natural_cron_parsing_mode = :single
         end

     This parses the first possible cron line from the given string and then ignores any additional cron lines.

     Ex. every day at 3:15 and 4:30

       • Equivalent to 15 3 * * *.
       • 30 4 * * * gets ignored.

  2. :strict

         Sidekiq::Cron.configure do |config|
           config.natural_cron_parsing_mode = :strict
         end

     This throws an error if the given string would be parsed into multiple cron lines.

     Ex. every day at 3:15 and 4:30

       • Would throw an error and the associated cron job would be invalid.
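
To make the difference between the two modes concrete, here is a minimal plain-Ruby sketch. It does not use Fugit or Sidekiq-Cron: `pick_cronline` is a hypothetical function, and the list of cron lines is supplied by hand to stand in for what the parser would produce.

```ruby
# Hypothetical stand-in for the mode handling: `cronlines` is the list of
# cron lines a natural-language string was parsed into.
def pick_cronline(cronlines, mode)
  case mode
  when :single
    # Keep the first cron line, silently ignore the rest.
    cronlines.first
  when :strict
    # Refuse strings that expand to more than one cron line.
    raise ArgumentError, "expected one cron line, got #{cronlines.size}" if cronlines.size > 1

    cronlines.first
  else
    raise ArgumentError, "unknown mode: #{mode.inspect}"
  end
end

parsed = ['15 3 * * *', '30 4 * * *'] # "every day at 3:15 and 4:30"

puts pick_cronline(parsed, :single)

begin
  pick_cronline(parsed, :strict)
rescue ArgumentError => e
  puts "invalid: #{e.message}"
end
```

The :strict mode trades convenience for safety: a schedule that would silently lose its second time slot under :single is rejected instead.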

Second-precision (sub-minute) cronlines

In addition to the standard 5-parameter cronline format, Sidekiq-Cron supports scheduling jobs with second-precision using a modified 6-parameter cronline format:

Seconds Minutes Hours Days Months DayOfWeek

For example: "*/30 * * * * *" would schedule a job to run every 30 seconds.

Note that if you plan to schedule jobs with second precision you may need to override the default schedule poll interval so it is lower than the interval of your jobs:

    Sidekiq::Cron.configure do |config|
      config.cron_poll_interval = 10
    end

The default value at time of writing is 30 seconds. See under the hood for more details.
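
If you keep many cronlines around, it can help to spot which ones use the 6-field layout and therefore may need a lower poll interval. The following is a rough plain-Ruby sketch, not part of the gem: `second_precision?` and the `TIMEZONE` pattern are assumptions made for illustration, and the timezone detection is a simple heuristic rather than Fugit's actual parsing.

```ruby
# Heuristic pattern for a trailing timezone such as "America/Chicago".
# This is an assumption for the sketch, not how Fugit recognises timezones.
TIMEZONE = %r{\A[A-Za-z_]+(/[A-Za-z][A-Za-z_+\-0-9]*)+\z}

# Hypothetical helper: true when the cronline uses the 6-field
# "Seconds Minutes Hours Days Months DayOfWeek" layout.
# A trailing timezone is not counted as a field.
def second_precision?(cronline)
  fields = cronline.split
  fields.pop if fields.last&.match?(TIMEZONE)
  fields.size == 6
end

puts second_precision?('*/30 * * * * *')
puts second_precision?('*/5 * * * *')
puts second_precision?('0 22 * * 1-5 America/Chicago')
```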

Namespacing

Default namespace

When not giving a namespace, the default one will be used.

In the case you’d like to change this value, you can change it via the following configuration flag:

    Sidekiq::Cron.configure do |config|
      config.default_namespace = 'statistics'
    end

Renaming namespace

If you rename the namespace of a job that is already running, the gem will not automatically delete the cron job associated with the old namespace. This means you could end up with two cron jobs running simultaneously.

To avoid this, it is recommended to delete all existing cron jobs associated with the old namespace before making the change. You can achieve this with the following code:

    Sidekiq::Cron::Job.all('YOUR_OLD_NAMESPACE_NAME').each { |job| job.destroy }

Available namespaces

By default, Sidekiq Cron uses the available_namespaces configuration option to determine which namespaces your application utilizes. The default namespace ("default", by default) is always included in the list of available namespaces.

If you want Sidekiq Cron to automatically detect existing namespaces from the Redis database, you can set available_namespaces to the special option :auto.

If available_namespaces is explicitly set and a job is created with an unexpected namespace, a warning will be printed, and the job will be assigned to the default namespace.
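
The fallback rule for an explicit list can be sketched in plain Ruby as follows. `resolve_namespace` is a hypothetical function that only mirrors the behaviour described above; it does not touch Redis, does not cover the :auto option, and is not the gem's own code.

```ruby
# Hypothetical stand-in for the namespace check described above.
def resolve_namespace(requested, available:, default: 'default')
  # The default namespace is always part of the available namespaces.
  allowed = (available + [default]).uniq
  return requested if allowed.include?(requested)

  warn "namespace #{requested.inspect} is not in #{allowed.inspect}; using #{default.inspect}"
  default
end

available = %w[statistics maintenance billing]

puts resolve_namespace('billing', available: available)
puts resolve_namespace('reports', available: available) # warns, then falls back
```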

Migrating to 2.3

As discussed in this issue, the approach introduced in Sidekiq Cron 2.0 for determining available namespaces using the KEYS command is not acceptable. Therefore, starting from version 2.3, namespacing has been reworked:

  • If you were not using the namespacing feature, no action is required. You can even remove available_namespaces = %w[default], as it is now the default.

  • If you were using the namespacing feature and explicitly specified available namespaces as a list, no changes are needed.

  • If you were using the namespacing feature and relied on automatic namespace inference, you should either specify all used namespaces explicitly or set available_namespaces to :auto to maintain automatic detection. However, note that this approach does not scale well (see the referenced issue for details).

Usage

When creating a new job, you can optionally give a namespace attribute, which you can then also pass to the find or destroy methods.

    Sidekiq::Cron::Job.create(
      name: 'Hard worker - every 5min',
      namespace: 'Foo',
      cron: '*/5 * * * *',
      class: 'HardWorker'
    )
    # INFO: Cron Jobs - add job with name Hard worker - every 5min in the namespace Foo

    # Without specifying the namespace, Sidekiq::Cron uses the `default` one, therefore `count` returns 0.
    Sidekiq::Cron::Job.count
    #=> 0

    # Searching in the job's namespace returns 1.
    Sidekiq::Cron::Job.count 'Foo'
    #=> 1

    # Same applies to `all`. Without a namespace, no jobs are found.
    Sidekiq::Cron::Job.all

    # But giving the job's namespace returns it.
    Sidekiq::Cron::Job.all 'Foo'
    #=> [#<Sidekiq::Cron::Job:0x00007f7848a326a0 ... @name="Hard worker - every 5min", @namespace="Foo", @cron="*/5 * * * *", @klass="HardWorker", @status="enabled" ... >]

    # If you'd like to get all the jobs across all the namespaces then pass an asterisk:
    Sidekiq::Cron::Job.all '*'
    #=> [#<Sidekiq::Cron::Job ...>]

    # `find` returns a single job (or nil), so it can be destroyed directly.
    job = Sidekiq::Cron::Job.find('Hard worker - every 5min', 'Foo')
    job.destroy
    # INFO: Cron Jobs - deleted job with name Hard worker - every 5min from namespace Foo
    #=> true

What objects/classes can be scheduled

Sidekiq Worker

In this example, we are using HardWorker which looks like:

    class HardWorker
      include Sidekiq::Worker

      def perform(*args)
        # do something
      end
    end

For Sidekiq workers, symbolize_args: true in Sidekiq::Cron::Job.create or in Hash configuration is ignored, as Sidekiq currently only allows simple JSON datatypes.

Active Job

You can schedule ExampleJob which looks like:

    class ExampleJob < ActiveJob::Base
      queue_as :default

      def perform(*args)
        # Do something
      end
    end

For Active Job you can use symbolize_args: true in Sidekiq::Cron::Job.create or in Hash configuration,
which will ensure that arguments you are passing to it will be symbolized when passed back to perform method in worker.
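
Why symbolizing matters becomes visible with a JSON round trip, which is how job arguments are stored. The sketch below uses only Ruby's standard library; the shallow `transform_keys` step is an illustration of the idea and is not claimed to be how the gem implements symbolize_args.

```ruby
require 'json'

args = [{ name: 'report', retries: 3 }]

# Job arguments are serialized as JSON, so symbol keys come back as strings.
round_tripped = JSON.parse(JSON.generate(args))
p round_tripped

# Sketch of what symbolizing does: turn the (top-level) hash keys back into symbols.
symbolized = round_tripped.map { |arg| arg.is_a?(Hash) ? arg.transform_keys(&:to_sym) : arg }
p symbolized

puts symbolized == args
```

Without this step, a perform method that reads `options[:name]` would get nil, because the key arrives as the string "name".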

Adding Cron jobs

Refer to Schedule vs Dynamic jobs to understand the difference.

    class HardWorker
      include Sidekiq::Worker

      def perform(name, count)
        # do something
      end
    end

    Sidekiq::Cron::Job.create(name: 'Hard worker - every 5min', cron: '*/5 * * * *', class: 'HardWorker') # execute every 5 minutes
    # => true

The create method returns only true or false, depending on whether the job was saved.

    job = Sidekiq::Cron::Job.new(name: 'Hard worker - every 5min', cron: '*/5 * * * *', class: 'HardWorker')

    if job.valid?
      job.save
    else
      puts job.errors
    end

    # or simply
    unless job.save
      puts job.errors # will return array of errors
    end

Use ActiveRecord models as arguments:

    class Person < ApplicationRecord
    end

    class HardWorker < ActiveJob::Base
      queue_as :default

      def perform(person)
        puts "person: #{person}"
      end
    end

    person = Person.create(id: 1)
    Sidekiq::Cron::Job.create(name: 'Hard worker - every 5min', cron: '*/5 * * * *', class: 'HardWorker', args: person)
    # => true

Load more jobs from hash:

    hash = {
      'name_of_job' => {
        'class' => 'MyClass',
        'cron' => '1 * * * *',
        'args' => '(OPTIONAL) [Array or Hash]'
      },
      'My super iber cool job' => {
        'class' => 'SecondClass',
        'cron' => '*/5 * * * *'
      }
    }

    Sidekiq::Cron::Job.load_from_hash hash

Load more jobs from array:

    array = [
      {
        'name' => 'name_of_job',
        'class' => 'MyClass',
        'cron' => '1 * * * *',
        'args' => '(OPTIONAL) [Array or Hash]'
      },
      {
        'name' => 'Cool Job for Second Class',
        'class' => 'SecondClass',
        'cron' => '*/5 * * * *'
      }
    ]

    Sidekiq::Cron::Job.load_from_array array
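
The hash form and the array form carry the same information: the hash uses the job name as key, while each array entry stores it under 'name'. A small plain-Ruby sketch of converting one into the other (the helper `schedule_hash_to_array` is hypothetical and not provided by the gem):

```ruby
# Hypothetical helper: convert the name-keyed hash form into the array form,
# where each entry carries its name under the 'name' key.
def schedule_hash_to_array(hash)
  hash.map { |name, props| { 'name' => name }.merge(props) }
end

hash = {
  'name_of_job' => { 'class' => 'MyClass', 'cron' => '1 * * * *' },
  'second_job'  => { 'class' => 'SecondClass', 'cron' => '*/5 * * * *' }
}

schedule_hash_to_array(hash).each do |job|
  puts "#{job['name']}: #{job['class']} @ #{job['cron']}"
end
```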

Bang-suffixed methods will remove jobs where source is schedule and are not present in the given hash/array, update jobs that have the same names, and create new ones when the names are previously unknown.

    Sidekiq::Cron::Job.load_from_hash! hash
    Sidekiq::Cron::Job.load_from_array! array
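
The effect of the bang-suffixed methods can be pictured as a three-way plan: create, update, remove. The sketch below works on plain Ruby data only; `plan_sync` and the job names are made up for illustration, and the gem's internal bookkeeping may look different.

```ruby
# Hypothetical planner: `existing` maps job name => source ('schedule' or 'dynamic'),
# `incoming` is the list of job names in the hash/array being loaded.
def plan_sync(existing, incoming)
  {
    # Names not seen before are created.
    create: incoming - existing.keys,
    # Names that already exist are updated.
    update: incoming & existing.keys,
    # Only jobs whose source is 'schedule' and that are missing from the input are removed.
    remove: existing.select { |name, source| source == 'schedule' && !incoming.include?(name) }.keys
  }
end

existing = {
  'nightly_report' => 'schedule',
  'old_cleanup'    => 'schedule',
  'adhoc_export'   => 'dynamic'
}

plan = plan_sync(existing, %w[nightly_report new_digest])
plan.each { |action, names| puts "#{action}: #{names.join(', ')}" }
```

Note that the dynamic job `adhoc_export` appears in none of the three lists: it is left alone even though it is absent from the loaded schedule.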

Loading jobs from schedule file

You can also load multiple jobs from a YAML file:

    # config/schedule.yml

    my_first_job:
      cron: "*/5 * * * *"
      class: "HardWorker"
      queue: hard_worker

    second_job:
      cron: "*/30 * * * *" # execute every 30 minutes
      class: "HardWorker"
      queue: hard_worker_long
      args:
        hard: "stuff"

There are multiple ways to load the jobs from a YAML file:

  1. The gem will automatically load the jobs mentioned in the config/schedule.yml file (it supports ERB).

  2. When you want to load jobs from a different filename, mention the filename in the Sidekiq-Cron configuration as follows:

         Sidekiq::Cron.configure do |config|
           config.cron_schedule_file = "config/users_schedule.yml"
         end

  3. Load the file manually as follows:

         # config/initializers/sidekiq.rb

         Sidekiq.configure_server do |config|
           config.on(:startup) do
             schedule_file = "config/users_schedule.yml"

             if File.exist?(schedule_file)
               schedule = YAML.load_file(schedule_file)

               Sidekiq::Cron::Job.load_from_hash!(schedule, source: "schedule")
             end
           end
         end
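
Note that `YAML.load_file` on its own does not evaluate ERB tags. If you load a schedule manually and want ERB in it, one possible approach is to render the text first and parse the result. The sketch below uses only Ruby's standard library and an inline string instead of a real file; the schedule content is a made-up example.

```ruby
require 'erb'
require 'yaml'

# Stand-in for the contents of a schedule file; the ERB tag is evaluated first.
schedule_source = <<~YAML
  my_first_job:
    cron: "*/<%= 5 %> * * * *"
    class: "HardWorker"
    queue: hard_worker
YAML

# Render ERB to plain YAML text, then parse it into a Hash.
rendered = ERB.new(schedule_source).result
schedule = YAML.safe_load(rendered)

puts schedule['my_first_job']['cron']
puts schedule['my_first_job']['class']
```

The resulting Hash has the same name-keyed shape that `Sidekiq::Cron::Job.load_from_hash!` is given in the manual example above.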

Finding jobs

    # return array of all jobs
    Sidekiq::Cron::Job.all

    # return one job by its unique name - case sensitive
    Sidekiq::Cron::Job.find "Job Name"

    # return one job by its unique name - you can use hash with 'name' key
    Sidekiq::Cron::Job.find name: "Job Name"

    # if job can't be found nil is returned

Destroy jobs

    # destroy all jobs
    Sidekiq::Cron::Job.destroy_all!

    # destroy job by its name
    Sidekiq::Cron::Job.destroy "Job Name"

    # destroy found job
    Sidekiq::Cron::Job.find('Job name').destroy

Work with job

    job = Sidekiq::Cron::Job.find('Job name')

    # disable cron scheduling
    job.disable!

    # enable cron scheduling
    job.enable!

    # get status of job:
    job.status
    # => enabled/disabled

    # enqueue job right now!
    job.enqueue!

Schedule vs Dynamic jobs

There are two potential job sources: schedule and dynamic.
Jobs associated with schedule files are labeled as schedule as their source,
whereas jobs created at runtime without the source=schedule argument are classified as dynamic.

The key distinction lies in how these jobs are managed.
When a schedule is loaded, any stale schedule jobs are automatically removed to ensure synchronization within the schedule.
The dynamic jobs remain unaffected by this process.

How to start scheduling?

Just start Sidekiq workers by running:

    $ bundle exec sidekiq

Web UI for Cron Jobs

If you are using Sidekiq’s web UI and would like it to show cron jobs as well,
add require 'sidekiq/cron/web' after require 'sidekiq/web'.

With this, you will get:

[Image: Web UI]

Under the hood

When you start the Sidekiq process, it starts one thread with a Sidekiq::Poller instance, which performs the adding of scheduled jobs to queues, retries, etc.

Sidekiq-Cron adds itself into this start procedure and starts another thread with Sidekiq::Cron::Poller, which checks all enabled Sidekiq cron jobs every 30 seconds to see whether they should be added to the queue (that is, whether their cronline matches the time of the check).

Sidekiq-Cron checks for jobs to enqueue every 30 seconds by default. You can change this by setting:

    Sidekiq::Cron.configure do |config|
      config.cron_poll_interval = 10
    end

When Sidekiq (and Sidekiq-Cron) is not used in zero-downtime deployments, after the deployment is done Sidekiq-Cron starts to catch up. It will consider older jobs that missed their schedules during that time. By default, only jobs that should have started less than 1 minute ago are considered. This is problematic for some jobs, e.g., jobs that run once a day. If on average Sidekiq is shut down for 10 minutes during deployments, you can configure Sidekiq-Cron to consider jobs that were about to be scheduled during that time:

    Sidekiq::Cron.configure do |config|
      config.reschedule_grace_period = 600 # 10 minutes in seconds
    end
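
The catch-up rule can be illustrated with a few lines of plain Ruby. `within_grace_period?` is a hypothetical check written for this example; it mirrors the description above ("should have started less than N seconds ago") and is not the gem's internal code.

```ruby
# Hypothetical check: should a run that was due at `scheduled_at` still be
# enqueued at `now`, given a grace period in seconds?
def within_grace_period?(scheduled_at, now, grace_period)
  missed_by = now - scheduled_at # Float seconds between the two Time values
  missed_by >= 0 && missed_by < grace_period
end

scheduled_at = Time.utc(2024, 1, 1, 3, 0, 0) # daily job due at 03:00
back_online  = Time.utc(2024, 1, 1, 3, 8, 0) # Sidekiq restarts 8 minutes later

puts within_grace_period?(scheduled_at, back_online, 60)  # default grace period
puts within_grace_period?(scheduled_at, back_online, 600) # 10-minute grace period
```

With the default of 60 seconds the daily run in this example is skipped until the next day; with 600 seconds it is still picked up after the restart.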

Sidekiq-Cron is safe to use with multiple Sidekiq processes or nodes. It uses a Redis sorted set to ensure that only the first process that asks can enqueue scheduled jobs into the queue.
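
The "first one to ask wins" idea can be simulated without Redis. In the sketch below, `FakeSortedSet` is an in-memory stand-in invented for this example, and the member format (`job name:timestamp`) is an assumption; the gem's real Redis keys and calls are not shown here.

```ruby
# In-memory stand-in for a Redis sorted set: adding a member succeeds only
# the first time, mimicking an "add if not already present" operation.
class FakeSortedSet
  def initialize
    @members = {}
    @lock = Mutex.new
  end

  # Returns true if the member was newly added, false if it was already there.
  def add_if_absent(score, member)
    @lock.synchronize do
      next false if @members.key?(member)

      @members[member] = score
      true
    end
  end
end

set = FakeSortedSet.new
enqueue_time = Time.utc(2024, 1, 1, 12, 5, 0)
member = "hard_worker:#{enqueue_time.to_i}" # assumed member format

# Three "processes" try to claim the same scheduled run.
results = 3.times.map { set.add_if_absent(enqueue_time.to_i, member) }
p results
```

Only the caller that gets `true` would go on to enqueue the job; the others see that the run has already been claimed and do nothing.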

When running with many Sidekiq processes, the polling can add significant load to Redis. You can disable polling on some processes by setting:

    Sidekiq::Cron.configure do |config|
      config.cron_poll_interval = 0
    end

Testing your configuration

You can test your application’s configuration by loading the schedule in your test suite. Below is an example using RSpec in a Rails project:

    # spec/cron_schedule_spec.rb

    require "rails_helper"

    RSpec.describe "Cron Schedule" do
      let(:schedule_loader) { Sidekiq::Cron::ScheduleLoader.new }
      let(:all_jobs) { Sidekiq::Cron::Job.all }

      # Confirms that `config.cron_schedule_file` points to a real file.
      it "has a schedule file" do
        expect(schedule_loader).to have_schedule_file
      end

      # Confirms that no jobs in the schedule have an invalid cron string.
      it "does not return any errors" do
        expect(schedule_loader.load).to be_empty
      end

      # May be subject to churn, but adds confidence.
      it "adds the expected number of jobs" do
        schedule_loader.load
        expect(all_jobs.size).to eq 5
      end

      # Confirms that all job classes exist.
      it "has a valid class for each added job" do
        schedule_loader.load
        # Shows that all classes exist (as we can constantize the names without raising).
        job_classes = all_jobs.map { |job| job.klass.constantize }
        # Naive check that classes are Sidekiq jobs (as they all have `.perform_async`).
        expect(job_classes).to all(respond_to(:perform_async))
      end
    end

Contributing

Thanks to all contributors, you’re awesome and this wouldn’t be possible without you!

  • Check out the latest master to make sure the feature hasn’t been implemented or the bug hasn’t been fixed yet.
  • Check out the issue tracker to make sure someone hasn’t already requested it and/or contributed it.
  • Fork the project.
  • Start a feature/bugfix branch.
  • Commit and push until you are happy with your contribution.
  • Make sure to add tests for it. This is important so we don’t break it in a future version unintentionally.
  • Open a pull request!

Testing

You can execute the test suite by running:

    $ bundle exec rake test

Using Docker

This project uses Docker Compose to orchestrate containers and get the test suite running on your local machine. The following commands give you a complete environment to build and test this gem:

  1. Build the Docker image (only the first time):

         docker compose -f docker/docker-compose.yml build

  2. Run the test suite:

         docker compose -f docker/docker-compose.yml run --rm tests

The first time it runs, this command downloads the project’s dependencies (Redis so far), creates the containers, and runs the default command to execute the tests.

Running other commands

In the case you need to run a command in the gem’s container, you would do it like so:

    docker compose -f docker/docker-compose.yml run --rm tests <HERE IS YOUR COMMAND>

Note that tests is the Docker Compose service name defined in the docker/docker-compose.yml file.

Running a single test file

Given you only want to run the tests from the test/unit/web_extension_test.rb file, you need to pass its path with the TEST env variable, so here is the command:

    docker compose -f docker/docker-compose.yml run --rm --env TEST=test/unit/web_extension_test.rb tests

License

Copyright (c) 2013 Ondrej Bartas. See LICENSE for further details.