To stop a running job in SLURM, use the scancel command, which asks the SLURM controller to cancel a job or to send it a specific signal. Here’s how you can stop a running job:
- Find the Job ID: First, find the ID of the job you want to stop. The squeue command lists your pending and running jobs along with their IDs.
squeue -u <your_username>
- Stop the Job: Once you have the job ID, use the scancel command followed by the job ID to request the job termination.
scancel <job_id>
For example, if the job ID is 12345:
scancel 12345
- Confirm Job Termination: After you issue the scancel command, SLURM tries to stop the job gracefully: the job's processes receive SIGTERM and, if they are still running once the configured grace period (KillWait) has passed, SIGKILL. The job’s state changes to “CANCELLED” if the termination is successful.
- Forcibly Kill the Job (Optional): If the job doesn’t stop gracefully with scancel, you can send SIGKILL to it directly with the --signal option. Use this only as a last resort, as it gives the job no chance to clean up properly.
scancel --signal=KILL <job_id>
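The steps above can be combined into a small shell helper. This is only a sketch under some assumptions: cancel_and_wait is a made-up function name, the 60-second timeout and 5-second polling interval are arbitrary defaults, and it assumes a single (non-array) job. It cancels the job, polls squeue until the job is no longer listed, and falls back to scancel's --signal option to send SIGKILL if the job is still there after the timeout.

```shell
#!/bin/sh
# Sketch: cancel a job, wait for it to leave the queue, escalate if it lingers.
# cancel_and_wait is a hypothetical helper name, not part of SLURM.
cancel_and_wait() {
    job_id=$1
    timeout=${2:-60}    # seconds to wait before escalating (assumed default)
    interval=${3:-5}    # seconds between checks (assumed default)
    waited=0

    # Ask SLURM to cancel the job; give up if the request itself fails.
    scancel "$job_id" || return 1

    while [ "$waited" -lt "$timeout" ]; do
        # -h drops the header and %T prints the job state. Empty output
        # means squeue no longer lists the job.
        state=$(squeue -h -j "$job_id" -o %T 2>/dev/null)
        if [ -z "$state" ] || [ "$state" = "CANCELLED" ]; then
            echo "job $job_id is gone"
            return 0
        fi
        sleep "$interval"
        waited=$((waited + interval))
    done

    # Last resort: send SIGKILL to the job.
    echo "job $job_id still $state after ${timeout}s, sending SIGKILL"
    scancel --signal=KILL "$job_id"
}
```

For example, cancel_and_wait 12345 120 10 would cancel job 12345 and check every 10 seconds for up to two minutes before escalating.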
It’s essential to exercise caution when stopping jobs, especially if they are critical or involve important data processing. Always communicate with the job owner if you need to stop their job and ensure it won’t have any adverse effects on their work.
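One way to build that caution into a script is to look up the job's owner before cancelling. The sketch below assumes a hypothetical helper named cancel_if_mine and relies on squeue's %u format field, which prints the owning user. It refuses to cancel a job that belongs to someone else.

```shell
#!/bin/sh
# Sketch: only cancel a job if it belongs to the current user.
# cancel_if_mine is a hypothetical helper name, not a SLURM command.
cancel_if_mine() {
    job_id=$1
    # %u prints the name of the user who owns the job.
    owner=$(squeue -h -j "$job_id" -o %u 2>/dev/null)
    me=$(id -un)

    if [ -z "$owner" ]; then
        echo "job $job_id not found in the queue" >&2
        return 1
    fi
    if [ "$owner" != "$me" ]; then
        echo "job $job_id belongs to $owner; check with them first" >&2
        return 2
    fi
    scancel "$job_id"
}
```

Called as cancel_if_mine 12345, it returns a non-zero status without touching the job unless the job is yours.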