Hello, everyone. I prepared paper_interface_design_pilot_commands.list when I did RIFdocking step 12, "Running a pilot job". When I run the command, an error occurs: Signal 6 (SIGABRT) means that the process was aborted. This usually means an internal Rosetta error caused by (often) bad inputs, (sometimes) developer error, or (rarely) hardware problems.
Has anyone encountered this error with Rosetta? How can I solve this problem? I'd be grateful if you could give me any help.
Unfortunately, Rosetta crashes much more than it should, and just a SIGABRT doesn't help narrow things down.
Do you have the full traceback from the ROSETTA_CRASH.log file? If you also have the tracer output that was printed during the run, that might help too.
The other thing to check with a SIGABRT is if you're running with a queueing system (like SLURM, PBS or Condor) -- if the queueing system canceled your run for some reason, you may get a SIGABRT error, and would have to check the queueing system logs for the reason. (Even if you didn't use a queueing system, it might have been your OS killing the process due to a lack of memory -- you may want to check how much free memory you have.)
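As a rough way to do the free-memory check mentioned above, here is a minimal Python sketch. It assumes a Linux machine that exposes /proc/meminfo; the function names `parse_meminfo` and `available_gib` are made up for this illustration and are not part of Rosetta.

```python
from pathlib import Path


def parse_meminfo(text):
    """Parse /proc/meminfo-style text into a dict of field name -> value in kiB."""
    fields = {}
    for line in text.splitlines():
        name, sep, rest = line.partition(":")
        if not sep:
            continue  # not a "Name: value" line
        parts = rest.split()
        if parts and parts[0].isdigit():
            fields[name.strip()] = int(parts[0])
    return fields


def available_gib(text):
    """Return available memory in GiB, or None if no usable field is present."""
    info = parse_meminfo(text)
    # Prefer MemAvailable; fall back to MemFree if the kernel does not report it.
    kib = info.get("MemAvailable", info.get("MemFree"))
    return None if kib is None else kib / (1024 ** 2)


if __name__ == "__main__":
    meminfo = Path("/proc/meminfo")  # Linux only (assumption)
    if meminfo.exists():
        gib = available_gib(meminfo.read_text())
        if gib is None:
            print("Could not find MemAvailable or MemFree in /proc/meminfo")
        else:
            print(f"Available memory: {gib:.1f} GiB")
    else:
        print("/proc/meminfo not found; this sketch only works on Linux")
```

Running it just before launching the job, and again while the job is running, gives a first idea of whether memory is getting tight. It does not replace checking the queueing system logs.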
Hi, rmoretti. Thank you for your advice. Actually, there is an H5 file in my command line. At first, I thought that rosetta_scripts.mpi.linuxgccrelease could not recognize the H5 file, and that this caused the error. So I installed rosetta_scripts.hdf5.linuxgccrelease, but I still got the same error after running it. So I really don't know what caused the error.
Here are the log file and the ROSETTA_CRASH.log file. Please give me some advice. Thank you very much.
Best I can tell from the crash log, it looks like there are issues opening your H5 file. I might suggest using a third-party HDF5 viewer/opener and checking to see if the file is corrupt or has other sorts of formatting issues.
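If no HDF5 viewer is at hand, a quick first test is whether the file starts with the HDF5 format signature. The sketch below uses only the Python standard library; `looks_like_hdf5` is a hypothetical helper name. It only looks for the 8-byte signature (at offset 0, or at 512, 1024, 2048, ... when the file has a user block), so a file that passes can still be corrupt further in. A directory or a missing file is reported as not HDF5.

```python
# The 8-byte signature that marks the start of an HDF5 superblock.
HDF5_SIGNATURE = b"\x89HDF\r\n\x1a\n"


def looks_like_hdf5(path):
    """Return True if the HDF5 signature is found at offset 0, 512, 1024, ..."""
    try:
        with open(path, "rb") as handle:
            size = handle.seek(0, 2)  # jump to the end to learn the file size
            offset = 0
            while offset + len(HDF5_SIGNATURE) <= size:
                handle.seek(offset)
                if handle.read(len(HDF5_SIGNATURE)) == HDF5_SIGNATURE:
                    return True
                # Candidate offsets are 0, then 512 and successive doublings.
                offset = 512 if offset == 0 else offset * 2
    except OSError:
        # Covers a missing file, a directory, and permission problems.
        return False
    return False
```

Calling `looks_like_hdf5("your_file.h5")` (placeholder name) and getting `False` suggests the path or the file itself is the problem, before Rosetta is involved at all.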
Aside from that, it looks like the log file is from the MPI run, but the crash log is from the HDF5 run. It may help a bit to get the log file from the HDF5 run -- if you could add the option `-out:levels core.indexed_structure_store:Trace protocols.core.indexed_structure_store:Trace` to the command line, that would help in getting extra debugging information.
Hello. I think I found the reason for the error: I had not written the full path to the H5 file. The flag should use /home/dengxj/ss_grouped_vall_all/ss_grouped_vall_all.h5 instead of /home/dengxj/ss_grouped_vall_all/. I had only specified the directory, not the file itself, and that led to the error. Now I've changed the flag file, and the job is running successfully. It has not finished yet.
But I have another problem. My program keeps running and the log file is being updated, but a ROSETTA_CRASH.log file is also generated in the process. How did this happen? Will my results be affected?
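For anyone hitting the same thing, a directory-instead-of-file mistake like this can be caught before launching the job. Below is a minimal sketch using only the Python standard library; `check_data_file` is a hypothetical helper, not a Rosetta tool, and the `.h5` suffix is just the default assumed here.

```python
from pathlib import Path


def check_data_file(path_text, suffix=".h5"):
    """Describe whether path_text is a regular file, a directory, or missing."""
    path = Path(path_text)
    if path.is_file():
        return "ok: regular file"
    if path.is_dir():
        # List files inside the directory that carry the expected suffix.
        candidates = sorted(
            p.name for p in path.iterdir() if p.is_file() and p.suffix == suffix
        )
        if candidates:
            return "directory, not a file; candidates inside: " + ", ".join(candidates)
        return f"directory, not a file; no {suffix} files inside"
    return "missing: no such file or directory"
```

With the paths from this thread, `check_data_file("/home/dengxj/ss_grouped_vall_all/")` would report a directory and name the `.h5` file inside it, which is the value the flag file actually needs.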
The flag file looks like this:
Here are the ROSETTA_CRASH.log files generated in the process.
Normally, the ROSETTA_CRASH.log file should only be generated if the Rosetta program exits with an error. However, depending on the error recovery within the protocol, there's a chance that the crash file gets written, but Rosetta is able to pick things up and continue.
If your program is continuing to run and producing outputs, I wouldn't be too concerned.
(The exact error you're seeing is due to issues interacting with the Dalphaball executable.)
Yes, as you said, I carefully examined the Dalphaball executable and fixed it. The run now completes successfully and I get the correct output. Thank you very much!