Hello Rosetta community,
I am looking for a recent paper that describes how to cluster the output models of Rosetta ligand docking by ligand RMSD (ideally one that gives the detailed clustering commands). I have studied several Rosetta ligand docking papers, such as "Assessment and Challenges of Ligand Docking into Comparative Models of G-Protein Coupled Receptor" by Nguyen and "Small-molecule ligand docking into comparative models with Rosetta" by Combs. Both used the BCL tool from the Meiler lab, but the former paper is not recent and the commands in its supplementary material no longer work (for instance, there is no bcl.exe ScoreSmallMolecule command now). The latter does not give the detailed commands for ligand clustering.
I appreciate your help.
Thanks for your question. Quite a lot has happened to the BCL since that paper was published! We are working on a large review/manual for BCL 4.0+ functionality, and I will make a special point of looking back at some of these older protocol captures so that I can add them as command-line examples with the updated syntax. The latest versions of the BCL are available on the Meiler Lab website (4.1 for Linux and Windows, 4.0 for Apple; we are still working on getting the newer Apple build ready, as there is currently an issue generating the disk image, but I anticipate it will be done in the near future).
One of the earlier changes with the BCL was consolidation of many comparison metrics into a single application. Instead of 'ScoreSmallMolecule' and a host of others, we now have a single application called Compare that performs these types of metrics. To see which options are available, you can do the following:
That will show the application groups and the available applications.
bcl.exe molecule:Compare --help
That will show the options specific to the Compare application in the molecule application group. For your purposes, I recommend running a command like this:
bcl.exe molecule:Compare <my_ligand_poses_file.sdf> -method SymmetryRealSpaceRMSD -output my_ligand_poses_rmsd.dat -scheduler PThread <n_threads> -bcl_table_format
This will do a pairwise comparison of the coordinates of the poses in the SDF <my_ligand_poses_file.sdf>. The comparison method is SymmetryRealSpaceRMSD. Unlike regular RMSD, it does two special things: (1) It accounts for symmetry, so if you have a kekulized carboxylic acid group and in one pose the double-bond oxygen is making a hydrogen bond while in the other pose the single-bond oxygen is making that hydrogen bond, it is going to know that the two are equivalent because of aromaticity; same thing for groups like trifluoromethyl, where fluorine atoms that are indexed differently between poses may still be chemically the same. (2) It measures the RMSD in Cartesian space, rather than in ligand conformational space. This is good for docking. If you had just generated a bunch of conformers and wanted the RMSDs between the conformers, you could use SymmetryRMSD instead. You can see the full list of options with the --help command shown above.
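To make the symmetry point concrete, here is a minimal Python sketch. It is not BCL code and makes no claim about how the BCL implements SymmetryRealSpaceRMSD. It assumes that "real space" means the coordinates are compared as they are, with no superposition, and that you already know which atom indices are interchangeable (the `equivalent` argument is a hypothetical input for illustration). The coordinates are made up.

```python
import itertools
import math


def rmsd(a, b):
    """Plain RMSD between two equally long coordinate lists, with no superposition."""
    sq = sum((p - q) ** 2 for u, v in zip(a, b) for p, q in zip(u, v))
    return math.sqrt(sq / len(a))


def symmetry_rmsd(a, b, equivalent):
    """Smallest in-place RMSD over all relabellings of the atoms in `equivalent`."""
    best = rmsd(a, b)
    for perm in itertools.permutations(equivalent):
        relabelled = list(b)
        for src, dst in zip(equivalent, perm):
            relabelled[src] = b[dst]
        best = min(best, rmsd(a, relabelled))
    return best


# Toy CF3 group: atom 0 is the carbon, atoms 1-3 are the fluorines.
pose_a = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (0.0, 1.0, 0.0), (0.0, 0.0, 1.0)]
# Same geometry, but the fluorines are indexed in a different order.
pose_b = [(0.0, 0.0, 0.0), (0.0, 1.0, 0.0), (0.0, 0.0, 1.0), (1.0, 0.0, 0.0)]

plain = rmsd(pose_a, pose_b)
symmetric = symmetry_rmsd(pose_a, pose_b, equivalent=[1, 2, 3])
print(f"plain RMSD: {plain:.3f}  symmetry-aware RMSD: {symmetric:.3f}")
```

The plain RMSD is well above zero even though the two poses are physically identical, while the symmetry-aware value is zero. Trying every permutation only scales to small groups of equivalent atoms; it is used here purely to keep the sketch short.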
Other things to note: you can use threads to parallelize the comparison by passing some number (e.g. 4) where I have <n_threads>. Also, the "bcl_table_format" flag is necessary if you want to subsequently use the output file with bcl.exe bcl:Cluster.
Then all you have to do is take that output file and run the same commands as indicated in the paper for Cluster:
bcl.exe bcl:Cluster -distance_input_file my_ligand_poses_rmsd.dat -output_file my_ligand_poses_rmsd.cluster.dat -output_format Rows Centers -input_format TableLowerTriangle -linkage Average -remove_internally_similar_nodes 3
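If it helps to see what average linkage means on a pairwise RMSD matrix, here is a small Python sketch of the general technique. It is not the BCL's implementation: the `cutoff` stopping rule and the way a cluster center is picked are simplifying assumptions of mine, and the sketch does not try to reproduce what `-remove_internally_similar_nodes 3` does. The RMSD values are made up.

```python
def average_linkage(dist, cutoff):
    """Merge clusters bottom-up until the closest pair is farther apart than `cutoff`.

    `dist[i][j]` is the pairwise distance (e.g. ligand RMSD) between poses i and j.
    """
    clusters = [[i] for i in range(len(dist))]

    def linkage(c1, c2):
        # Average linkage: mean of all pairwise distances between the two clusters.
        return sum(dist[i][j] for i in c1 for j in c2) / (len(c1) * len(c2))

    while len(clusters) > 1:
        pairs = [
            (linkage(clusters[x], clusters[y]), x, y)
            for x in range(len(clusters))
            for y in range(x + 1, len(clusters))
        ]
        d, x, y = min(pairs)
        if d > cutoff:
            break
        clusters[x] = clusters[x] + clusters[y]
        del clusters[y]
    return [sorted(c) for c in clusters]


def center(cluster, dist):
    """Member with the smallest summed distance to the other members (an assumption)."""
    return min(cluster, key=lambda i: sum(dist[i][j] for j in cluster))


# Made-up symmetric RMSD matrix (in angstroms) for five poses.
rmsd_matrix = [
    [0.0, 0.5, 0.8, 6.0, 6.5],
    [0.5, 0.0, 0.6, 6.2, 6.1],
    [0.8, 0.6, 0.0, 5.9, 6.3],
    [6.0, 6.2, 5.9, 0.0, 0.7],
    [6.5, 6.1, 6.3, 0.7, 0.0],
]

clusters = average_linkage(rmsd_matrix, cutoff=2.0)
centers = [center(c, rmsd_matrix) for c in clusters]
print(clusters, centers)
```

With this toy matrix, poses 0 to 2 end up in one cluster and poses 3 and 4 in another, which is the kind of grouping you would then inspect in the Cluster output.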
With those two BCL commands (Compare, then Cluster), your poses should be clustered by ligand RMSD. Let me know if you have any issues. Happy to help troubleshoot.